2024-12-18T03:32:00.1552394Z Current runner version: '2.321.0'
2024-12-18T03:32:00.1558374Z Runner name: 'pytorch-rocm-hw-42'
2024-12-18T03:32:00.1559072Z Runner group name: 'linux.rocm.gpu.group'
2024-12-18T03:32:00.1559926Z Machine name: 'pytorch-rocm-hw-42'
2024-12-18T03:32:00.1562395Z ##[group]GITHUB_TOKEN Permissions
2024-12-18T03:32:00.1564456Z Contents: read
2024-12-18T03:32:00.1564964Z Metadata: read
2024-12-18T03:32:00.1565403Z ##[endgroup]
2024-12-18T03:32:00.1568278Z Secret source: Actions
2024-12-18T03:32:00.1568917Z Prepare workflow directory
2024-12-18T03:32:00.4426056Z Prepare all required actions
2024-12-18T03:32:00.4462092Z Getting action download info
2024-12-18T03:32:00.7945140Z Download action repository 'pytorch/pytorch@release/2.6' (SHA:0cdf8b1d09254cfda66191d1bd01e3041c3c76f7)
2024-12-18T03:32:07.8091956Z Download action repository 'aws-actions/configure-aws-credentials@v4' (SHA:e3dd6a429d7300a6a4c196c26e071d42e0343502)
2024-12-18T03:32:08.3558513Z Download action repository 'aws-actions/amazon-ecr-login@v2' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076)
2024-12-18T03:32:08.8228106Z Download action repository 'pytorch/test-infra@release/2.6' (SHA:eb0adf5a84668865394af69e26428b32c8105c1c)
2024-12-18T03:32:09.8127044Z Download action repository 'actions/upload-artifact@v4' (SHA:6f51ac03b9356f520e9adb1b1b7802705f340c2b)
2024-12-18T03:32:10.5285937Z Getting action download info
2024-12-18T03:32:10.7106947Z Download action repository 'malfet/checkout@silent-checkout' (SHA:e07af140b3ccefc05679e3755b9db68f4ee4589c)
2024-12-18T03:32:11.2844753Z Getting action download info
2024-12-18T03:32:11.4640021Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2024-12-18T03:32:11.9966035Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/release/2.6 (0cdf8b1d09254cfda66191d1bd01e3041c3c76f7)
2024-12-18T03:32:11.9967821Z ##[group] Inputs
2024-12-18T03:32:11.9968114Z build-environment: linux-focal-rocm6.2-py3.10
2024-12-18T03:32:11.9969333Z test-matrix: {"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}]}
2024-12-18T03:32:11.9970683Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1
2024-12-18T03:32:11.9971180Z sync-tag:
2024-12-18T03:32:11.9971799Z timeout-minutes: 300
2024-12-18T03:32:11.9972011Z tests-to-include:
2024-12-18T03:32:11.9972476Z disable-monitor: true
2024-12-18T03:32:11.9972675Z ##[endgroup]
2024-12-18T03:32:11.9973077Z Complete job name: linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed)
2024-12-18T03:32:12.0966645Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@release/2.6
2024-12-18T03:32:12.0967277Z with:
2024-12-18T03:32:12.0967439Z   no-sudo: true
2024-12-18T03:32:12.0967630Z   submodules: recursive
2024-12-18T03:32:12.0967822Z   fetch-depth: 0
2024-12-18T03:32:12.0968122Z env:
2024-12-18T03:32:12.0968285Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:32:12.0968485Z ##[endgroup]
2024-12-18T03:32:12.1038041Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2024-12-18T03:32:12.1038784Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2024-12-18T03:32:12.1069983Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:32:12.1070287Z env:
2024-12-18T03:32:12.1070459Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:32:12.1070659Z ##[endgroup]
2024-12-18T03:32:12.1298945Z ##[group]Run retry () {
2024-12-18T03:32:12.1299377Z retry () {
2024-12-18T03:32:12.1300238Z   $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
2024-12-18T03:32:12.1300759Z }
2024-12-18T03:32:12.1301115Z echo "${GITHUB_WORKSPACE}"
2024-12-18T03:32:12.1301631Z if [ -z "${NO_SUDO}" ]; then
2024-12-18T03:32:12.1302187Z   retry sudo rm -rf "${GITHUB_WORKSPACE}"
2024-12-18T03:32:12.1302719Z else
2024-12-18T03:32:12.1303115Z   retry rm -rf "${GITHUB_WORKSPACE}"
2024-12-18T03:32:12.1303622Z fi
2024-12-18T03:32:12.1303998Z mkdir "${GITHUB_WORKSPACE}"
2024-12-18T03:32:12.1340803Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:32:12.1341423Z env:
2024-12-18T03:32:12.1341777Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:32:12.1342195Z   NO_SUDO: true
2024-12-18T03:32:12.1342556Z ##[endgroup]
2024-12-18T03:32:12.1411263Z /home/pytorchci/actions-runner/_work/pytorch/pytorch
2024-12-18T03:32:15.8069359Z ##[group]Run malfet/checkout@silent-checkout
2024-12-18T03:32:15.8070030Z with:
2024-12-18T03:32:15.8070494Z   ref: 0cdf8b1d09254cfda66191d1bd01e3041c3c76f7
2024-12-18T03:32:15.8071077Z   fetch-depth: 0
2024-12-18T03:32:15.8071482Z   submodules: recursive
2024-12-18T03:32:15.8071929Z   quiet-checkout: true
2024-12-18T03:32:15.8072394Z   repository: pytorch/pytorch
2024-12-18T03:32:15.8073104Z   token: ***
2024-12-18T03:32:15.8073507Z   ssh-strict: true
2024-12-18T03:32:15.8073958Z   persist-credentials: true
2024-12-18T03:32:15.8074429Z   clean: true
2024-12-18T03:32:15.8074864Z   sparse-checkout-cone-mode: true
2024-12-18T03:32:15.8075380Z   lfs: false
2024-12-18T03:32:15.8075763Z   set-safe-directory: true
2024-12-18T03:32:15.8076745Z env:
2024-12-18T03:32:15.8077319Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:32:15.8077765Z ##[endgroup]
2024-12-18T03:32:15.8950360Z Syncing repository: pytorch/pytorch
2024-12-18T03:32:15.8952818Z ##[group]Getting Git version info
2024-12-18T03:32:15.8953764Z Working directory is '/home/pytorchci/actions-runner/_work/pytorch/pytorch'
2024-12-18T03:32:15.8954988Z [command]/usr/bin/git version
2024-12-18T03:32:15.8955483Z git version 2.34.1
2024-12-18T03:32:15.8957064Z ##[endgroup]
2024-12-18T03:32:15.8963583Z Temporarily overriding HOME='/home/pytorchci/actions-runner/_work/_temp/5adaf6b2-ae44-4693-bda1-2b488c6b0ed8' before making global git config changes
2024-12-18T03:32:15.8965271Z Adding repository directory to the temporary git global config as a safe directory
2024-12-18T03:32:15.8966611Z [command]/usr/bin/git config --global --add safe.directory /home/pytorchci/actions-runner/_work/pytorch/pytorch
2024-12-18T03:32:15.8968647Z Deleting the contents of '/home/pytorchci/actions-runner/_work/pytorch/pytorch'
2024-12-18T03:32:15.8969952Z ##[group]Initializing the repository
2024-12-18T03:32:15.8970828Z [command]/usr/bin/git init /home/pytorchci/actions-runner/_work/pytorch/pytorch
2024-12-18T03:32:15.8972497Z hint: Using 'master' as the name for the initial branch. This default branch name
2024-12-18T03:32:15.8973643Z hint: is subject to change. To configure the initial branch name to use in all
2024-12-18T03:32:15.8974779Z hint: of your new repositories, which will suppress this warning, call:
2024-12-18T03:32:15.8975490Z hint:
2024-12-18T03:32:15.8976010Z hint:   git config --global init.defaultBranch <name>
2024-12-18T03:32:15.8976723Z hint:
2024-12-18T03:32:15.8977401Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
2024-12-18T03:32:15.8978409Z hint: 'development'. The just-created branch can be renamed via this command:
2024-12-18T03:32:15.8979146Z hint:
2024-12-18T03:32:15.8979534Z hint:   git branch -m <name>
2024-12-18T03:32:15.8980446Z Initialized empty Git repository in /home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/
2024-12-18T03:32:15.8985013Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch
2024-12-18T03:32:15.9028487Z ##[endgroup]
2024-12-18T03:32:15.9028838Z ##[group]Disabling automatic garbage collection
2024-12-18T03:32:15.9030399Z [command]/usr/bin/git config --local gc.auto 0
2024-12-18T03:32:15.9051703Z ##[endgroup]
2024-12-18T03:32:15.9052014Z ##[group]Setting up auth
2024-12-18T03:32:15.9055735Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2024-12-18T03:32:15.9076629Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
2024-12-18T03:32:15.9360938Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2024-12-18T03:32:15.9391550Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
2024-12-18T03:32:15.9630974Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic ***
2024-12-18T03:32:15.9661822Z ##[endgroup]
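
Note: the "Run retry () { ... }" step above wraps the workspace cleanup in a retry-with-backoff helper: up to five attempts, sleeping 1, 2, 4, then 8 seconds between them, so a transiently busy or slow filesystem does not fail the job. A minimal standalone sketch of the same pattern (illustrative only; it uses "$@" rather than the step's unquoted $* so that arguments containing spaces survive word-splitting):

  #!/usr/bin/env bash
  # Retry a command up to five times with growing sleeps (1s, 2s, 4s, 8s),
  # mirroring the backoff schedule used by the cleanup step in this job.
  retry () {
    "$@" && return 0          # first attempt
    local delay
    for delay in 1 2 4 8; do  # four more attempts with increasing waits
      sleep "$delay"
      "$@" && return 0
    done
    return 1                  # every attempt failed
  }

  # Example usage, as in the step above: clear and recreate the workspace.
  retry rm -rf "${GITHUB_WORKSPACE}"
  mkdir "${GITHUB_WORKSPACE}"
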
2024-12-18T03:32:15.9662419Z ##[group]Fetching the repository
2024-12-18T03:32:15.9666278Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --progress --no-recurse-submodules --quiet origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
2024-12-18T03:32:21.0911861Z remote: Enumerating objects: 1056590, done.
[... incremental "remote: Counting objects: N% (n/937)" and "remote: Compressing objects: N% (n/399)" progress updates elided ...]
2024-12-18T03:32:21.1453190Z remote: Counting objects: 100% (937/937), done.
2024-12-18T03:32:21.5331167Z remote: Compressing objects: 100% (399/399), done.
2024-12-18T03:32:46.0302581Z remote: Total 1056590 (delta 759), reused 548 (delta 538), pack-reused 1055653 (from 5)
2024-12-18T03:32:56.2641061Z [command]/usr/bin/git rev-parse --verify --quiet 0cdf8b1d09254cfda66191d1bd01e3041c3c76f7^{object}
2024-12-18T03:32:56.2668046Z 0cdf8b1d09254cfda66191d1bd01e3041c3c76f7
2024-12-18T03:32:56.2673840Z ##[endgroup]
2024-12-18T03:32:56.2674616Z ##[group]Determining the checkout info
2024-12-18T03:32:56.2675960Z ##[endgroup]
2024-12-18T03:32:56.2676644Z ##[group]Checking out the ref
2024-12-18T03:32:56.2683169Z [command]/usr/bin/git checkout --quiet --force 0cdf8b1d09254cfda66191d1bd01e3041c3c76f7
2024-12-18T03:32:57.6695537Z ##[endgroup]
2024-12-18T03:32:57.6696382Z ##[group]Setting up auth for fetching submodules
2024-12-18T03:32:57.6704062Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic ***
2024-12-18T03:32:57.6776501Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf
2024-12-18T03:32:57.6810731Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com:
2024-12-18T03:32:57.6842307Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com:
2024-12-18T03:32:57.6874855Z ##[endgroup]
2024-12-18T03:32:57.6880974Z ##[group]Fetching submodules
2024-12-18T03:32:57.6881689Z [command]/usr/bin/git submodule sync --recursive
2024-12-18T03:32:57.7170487Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive
2024-12-18T03:32:57.7421770Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni'
2024-12-18T03:32:57.7423948Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16'
2024-12-18T03:32:57.7427270Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv'
2024-12-18T03:32:57.7430476Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK'
2024-12-18T03:32:57.7433660Z Submodule 'third_party/NVTX' (https://github.com/NVIDIA/NVTX.git) registered for path 'third_party/NVTX'
2024-12-18T03:32:57.7437181Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator'
2024-12-18T03:32:57.7440301Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK'
2024-12-18T03:32:57.7443765Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark'
2024-12-18T03:32:57.7447062Z Submodule 'third_party/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/composable_kernel'
2024-12-18T03:32:57.7450598Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib'
2024-12-18T03:32:57.7454140Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo'
2024-12-18T03:32:57.7458546Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend'
2024-12-18T03:32:57.7462197Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass'
2024-12-18T03:32:57.7465977Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen'
2024-12-18T03:32:57.7469647Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm'
2024-12-18T03:32:57.7473605Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers'
2024-12-18T03:32:57.7477478Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt'
2024-12-18T03:32:57.7481657Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp'
2024-12-18T03:32:57.7485723Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo'
2024-12-18T03:32:57.7489927Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest'
2024-12-18T03:32:57.7494000Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep'
2024-12-18T03:32:57.7498636Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi'
2024-12-18T03:32:57.7502891Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto'
2024-12-18T03:32:57.7507371Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc'
2024-12-18T03:32:57.7511802Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl'
2024-12-18T03:32:57.7516332Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann'
2024-12-18T03:32:57.7521090Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx'
2024-12-18T03:32:57.7525730Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp'
2024-12-18T03:32:57.7530505Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft'
2024-12-18T03:32:57.7535161Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf'
2024-12-18T03:32:57.7539981Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd'
2024-12-18T03:32:57.7544957Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool'
2024-12-18T03:32:57.7549824Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11'
2024-12-18T03:32:57.7554955Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy'
2024-12-18T03:32:57.7559918Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef'
2024-12-18T03:32:57.7565172Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe'
2024-12-18T03:32:57.7636200Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'...
2024-12-18T03:33:00.7821832Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/FP16'...
2024-12-18T03:33:03.3940871Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'...
2024-12-18T03:33:05.8323224Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'...
2024-12-18T03:33:08.5497694Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/NVTX'...
2024-12-18T03:33:11.7673753Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'...
2024-12-18T03:33:16.4972329Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'...
2024-12-18T03:33:25.3807871Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/benchmark'...
2024-12-18T03:33:28.1903991Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/composable_kernel'...
2024-12-18T03:33:32.2433392Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/cpp-httplib'...
2024-12-18T03:33:35.0360647Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'...
2024-12-18T03:33:37.9852864Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'...
2024-12-18T03:33:41.8787240Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/cutlass'...
2024-12-18T03:33:46.2719857Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/eigen'...
2024-12-18T03:33:50.0083011Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'...
2024-12-18T03:33:53.8311507Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'...
2024-12-18T03:33:57.1616146Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fmt'...
2024-12-18T03:34:00.3530310Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'...
2024-12-18T03:34:02.9682284Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/gloo'...
2024-12-18T03:34:05.9918245Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/googletest'...
2024-12-18T03:34:09.5410056Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/ideep'...
2024-12-18T03:34:12.6226875Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/ittapi'...
2024-12-18T03:34:15.1665372Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto'...
2024-12-18T03:34:18.5237852Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/mimalloc'...
2024-12-18T03:34:22.4309735Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'...
2024-12-18T03:34:26.1063067Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'...
2024-12-18T03:34:35.0845107Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/onnx'...
2024-12-18T03:34:39.7437648Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp'...
2024-12-18T03:34:45.6128361Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'...
2024-12-18T03:34:48.2815250Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/protobuf'...
2024-12-18T03:34:56.7284146Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/psimd'...
2024-12-18T03:34:59.7757081Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'...
2024-12-18T03:35:02.7326060Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/pybind11'...
2024-12-18T03:35:06.4846574Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'...
2024-12-18T03:35:09.6503143Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/sleef'...
2024-12-18T03:35:13.0828691Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'...
2024-12-18T03:35:16.6307923Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f'
2024-12-18T03:35:16.6569428Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3'
2024-12-18T03:35:16.6802371Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1'
2024-12-18T03:35:16.7226616Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73'
2024-12-18T03:35:16.7743567Z Submodule path 'third_party/NVTX': checked out 'e170594ac7cf1dac584da473d4ca9301087090c1'
2024-12-18T03:35:16.8282018Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191'
2024-12-18T03:35:17.5771615Z Submodule path 'third_party/XNNPACK': checked out '4ea82e595b36106653175dcb04b2aa532660d0d8'
2024-12-18T03:35:17.6208530Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415'
2024-12-18T03:35:17.8648920Z Submodule path 'third_party/composable_kernel': checked out '50ee4267e27b875d149e642f4cebd47be1dc3b57'
2024-12-18T03:35:17.9378321Z Submodule path 'third_party/cpp-httplib': checked out '3b6597bba913d51161383657829b7e644e59c006'
2024-12-18T03:35:18.0523122Z Submodule path 'third_party/cpuinfo': checked out '1e83a2fdd3102f65c6f1fb602c1b320486218a99'
2024-12-18T03:35:18.1097694Z Submodule path 'third_party/cudnn_frontend': checked out '936021bfed8c91dc416af1588b2c4eca631a9e45'
2024-12-18T03:35:18.6293240Z Submodule path 'third_party/cutlass': checked out 'bbe579a9e3beb6ea6626d9227ec32d0dae119a49'
2024-12-18T03:35:18.8822488Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c'
2024-12-18T03:35:18.9898472Z Submodule path 'third_party/fbgemm': checked out 'dbc3157bf256f1339b3fa1fef2be89ac4078be0e'
2024-12-18T03:35:18.9965139Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit'
2024-12-18T03:35:18.9967959Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo'
2024-12-18T03:35:18.9971253Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/fbgemm/third_party/cutlass'
2024-12-18T03:35:18.9974339Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest'
2024-12-18T03:35:18.9980955Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch'
2024-12-18T03:35:19.0027823Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'...
2024-12-18T03:35:22.4618414Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'...
2024-12-18T03:35:26.6206076Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cutlass'...
2024-12-18T03:35:31.3349001Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'...
2024-12-18T03:35:35.1006765Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'...
2024-12-18T03:35:38.0512139Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81'
2024-12-18T03:35:38.1635079Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3'
2024-12-18T03:35:38.5813362Z Submodule path 'third_party/fbgemm/third_party/cutlass': checked out 'fc9ebc645b63f3a6bc80aaefde5c063fb72110d6'
2024-12-18T03:35:38.6586352Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796'
2024-12-18T03:35:38.6865008Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '23f53b025b466d8ec3c45d52290d3442f7fbe6b1'
2024-12-18T03:35:38.8305348Z Submodule path 'third_party/flatbuffers': checked out '01834de25e4bf3975a9a00e816292b1ad0fe184b'
2024-12-18T03:35:38.8883990Z Submodule path 'third_party/fmt': checked out '0c9fce2ffefecfdce794e1859584e25877b7b592'
2024-12-18T03:35:38.9462692Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350'
2024-12-18T03:35:38.9908406Z Submodule path 'third_party/gloo': checked out '5354032ea08eadd7fc4456477f7f7c6308818509'
2024-12-18T03:35:39.0479738Z Submodule path 'third_party/googletest': checked out 'b514bdc898e2951020cbdca1304b75f5950d1f59'
2024-12-18T03:35:39.0790107Z Submodule path 'third_party/ideep': checked out 'c7ccd5bdbe5434ba156f4e856dcef0601637334b'
2024-12-18T03:35:39.0849708Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn'
2024-12-18T03:35:39.0897517Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'...
2024-12-18T03:35:52.9572679Z Submodule path 'third_party/ideep/mkl-dnn': checked out '66f0cb9eb66affd2da3bf5f8d897376f04aae6af'
2024-12-18T03:35:52.9990603Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42'
2024-12-18T03:35:53.1018078Z Submodule path 'third_party/kineto': checked out '338140f58a28d599da3434ced4fd2d75dd1a213d'
2024-12-18T03:35:53.1110214Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog'
2024-12-18T03:35:53.1114626Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt'
2024-12-18T03:35:53.1115765Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest'
2024-12-18T03:35:53.1183820Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'...
2024-12-18T03:35:56.7528450Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'...
2024-12-18T03:36:01.0236495Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'...
2024-12-18T03:36:05.7436685Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e'
2024-12-18T03:36:05.7504174Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2024-12-18T03:36:05.7509968Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2024-12-18T03:36:05.7512343Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2024-12-18T03:36:05.7515793Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2024-12-18T03:36:05.7519916Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2024-12-18T03:36:05.7523732Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2024-12-18T03:36:05.7527170Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2024-12-18T03:36:05.7532736Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2024-12-18T03:36:05.7596660Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'...
2024-12-18T03:36:10.3050940Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'...
2024-12-18T03:36:13.8035898Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'...
2024-12-18T03:36:17.5898585Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'...
2024-12-18T03:36:21.6910708Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'...
2024-12-18T03:36:25.8984520Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'...
2024-12-18T03:36:29.9858432Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'...
2024-12-18T03:36:40.0237506Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'...
2024-12-18T03:36:44.1545692Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9'
2024-12-18T03:36:44.1886762Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400'
2024-12-18T03:36:44.2424654Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05'
2024-12-18T03:36:44.2708519Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067'
2024-12-18T03:36:44.2756081Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2024-12-18T03:36:44.2807156Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'...
2024-12-18T03:36:49.2165358Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4'
2024-12-18T03:36:49.2505689Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446'
2024-12-18T03:36:49.3053201Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850'
2024-12-18T03:36:49.4229997Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5'
2024-12-18T03:36:49.4585004Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150'
2024-12-18T03:36:49.5145643Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164'
2024-12-18T03:36:49.5818708Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347'
2024-12-18T03:36:49.6368387Z Submodule path 'third_party/mimalloc': checked out 'b66e3214d8a104669c2ec05ae91ebc26a8f5ab78'
2024-12-18T03:36:49.6787513Z Submodule path 'third_party/nccl/nccl': checked out 'ab2b89c4c339bd7f816fbc114a4b05d386b66290'
2024-12-18T03:36:49.8040471Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806'
2024-12-18T03:36:50.2160013Z Submodule path 'third_party/onnx': checked out 'b8baa8446686496da4cc8fda09f2b6fe65c2a02c'
2024-12-18T03:36:50.2242860Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11'
2024-12-18T03:36:50.2324477Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'...
2024-12-18T03:36:55.8332907Z Submodule path 'third_party/onnx/third_party/pybind11': checked out '3e9dfa2866941655c56877882565e7577de6fc7b'
2024-12-18T03:36:55.9227883Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878'
2024-12-18T03:36:55.9313581Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark'
2024-12-18T03:36:55.9317801Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest'
2024-12-18T03:36:55.9320413Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2024-12-18T03:36:55.9324794Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2024-12-18T03:36:55.9327665Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2024-12-18T03:36:55.9331457Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2024-12-18T03:36:55.9335782Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2024-12-18T03:36:55.9339651Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg'
2024-12-18T03:36:55.9387895Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'...
2024-12-18T03:37:00.7358205Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'...
2024-12-18T03:37:06.3271654Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'...
2024-12-18T03:37:11.8636018Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'...
2024-12-18T03:37:22.8029667Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'...
2024-12-18T03:37:28.2937588Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'...
2024-12-18T03:37:33.8000385Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'...
2024-12-18T03:37:39.9572553Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'...
2024-12-18T03:37:52.9592154Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2'
2024-12-18T03:37:53.0145058Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1'
2024-12-18T03:37:53.0461236Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa'
2024-12-18T03:37:53.1665602Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d'
2024-12-18T03:37:53.2005813Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce'
2024-12-18T03:37:53.2323496Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5'
2024-12-18T03:37:53.2600842Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d'
2024-12-18T03:37:53.2655748Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2024-12-18T03:37:53.2659168Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2024-12-18T03:37:53.2709336Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'...
2024-12-18T03:37:59.9167275Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'...
2024-12-18T03:38:06.7760446Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4'
2024-12-18T03:38:06.8404922Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929'
2024-12-18T03:38:07.3484258Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50'
2024-12-18T03:38:07.3826028Z Submodule path 'third_party/pocketfft': checked out '9d3ab05a7fffbc71a492bc6a17be034e83e8f0fe'
2024-12-18T03:38:07.6682440Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a'
2024-12-18T03:38:07.6780380Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark'
2024-12-18T03:38:07.6789215Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest'
2024-12-18T03:38:07.6862044Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'...
2024-12-18T03:38:14.0431714Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'...
2024-12-18T03:38:20.8019765Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8'
2024-12-18T03:38:20.8815652Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081'
2024-12-18T03:38:20.9106894Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900'
2024-12-18T03:38:20.9375321Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8'
2024-12-18T03:38:20.9910551Z Submodule path 'third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4'
2024-12-18T03:38:21.0414232Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67'
2024-12-18T03:38:21.1026134Z Submodule path 'third_party/sleef': checked out '60e76d2bce17d278b439d9da17177c8f957a9e9b'
2024-12-18T03:38:21.1453447Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e'
2024-12-18T03:38:21.1514952Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest'
2024-12-18T03:38:21.1518310Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop'
2024-12-18T03:38:21.1521870Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv'
2024-12-18T03:38:21.1525679Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11'
2024-12-18T03:38:21.1581755Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'...
2024-12-18T03:38:28.0643455Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'...
2024-12-18T03:38:34.0902569Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'...
2024-12-18T03:38:42.3850898Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'...
2024-12-18T03:38:49.2411126Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e'
2024-12-18T03:38:49.2713169Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281'
2024-12-18T03:38:49.3447765Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242'
2024-12-18T03:38:49.3900112Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef'
2024-12-18T03:38:49.3958452Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2024-12-18T03:38:49.4012565Z Cloning into '/home/pytorchci/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'...
2024-12-18T03:38:56.4906486Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
2024-12-18T03:38:56.5027603Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0
2024-12-18T03:38:56.5305314Z Entering 'android/libs/fbjni'
2024-12-18T03:38:56.5352792Z Entering 'third_party/FP16'
2024-12-18T03:38:56.5395763Z Entering 'third_party/FXdiv'
2024-12-18T03:38:56.5431593Z Entering 'third_party/NNPACK'
2024-12-18T03:38:56.5466261Z Entering 'third_party/NVTX'
2024-12-18T03:38:56.5503296Z Entering 'third_party/VulkanMemoryAllocator'
2024-12-18T03:38:56.5539546Z Entering 'third_party/XNNPACK'
2024-12-18T03:38:56.5586410Z Entering 'third_party/benchmark'
2024-12-18T03:38:56.5627588Z Entering 'third_party/composable_kernel'
2024-12-18T03:38:56.5673624Z Entering 'third_party/cpp-httplib'
2024-12-18T03:38:56.5712408Z Entering 'third_party/cpuinfo'
2024-12-18T03:38:56.5752517Z Entering 'third_party/cudnn_frontend'
2024-12-18T03:38:56.5791704Z Entering 'third_party/cutlass'
2024-12-18T03:38:56.5837629Z Entering 'third_party/eigen'
2024-12-18T03:38:56.5883075Z Entering 'third_party/fbgemm'
2024-12-18T03:38:56.5919426Z Entering 'third_party/fbgemm/third_party/asmjit'
2024-12-18T03:38:56.5961454Z Entering 'third_party/fbgemm/third_party/cpuinfo'
2024-12-18T03:38:56.5998632Z Entering 'third_party/fbgemm/third_party/cutlass'
2024-12-18T03:38:56.6042756Z Entering 'third_party/fbgemm/third_party/googletest'
2024-12-18T03:38:56.6080945Z Entering 'third_party/fbgemm/third_party/hipify_torch'
2024-12-18T03:38:56.6121083Z Entering 'third_party/flatbuffers'
2024-12-18T03:38:56.6164449Z Entering 'third_party/fmt'
2024-12-18T03:38:56.6203670Z Entering 'third_party/gemmlowp/gemmlowp'
2024-12-18T03:38:56.6239443Z Entering 'third_party/gloo'
2024-12-18T03:38:56.6277912Z Entering 'third_party/googletest'
2024-12-18T03:38:56.6316210Z Entering 'third_party/ideep'
2024-12-18T03:38:56.6352744Z Entering 'third_party/ideep/mkl-dnn'
2024-12-18T03:38:56.6400042Z Entering 'third_party/ittapi'
2024-12-18T03:38:56.6438489Z Entering 'third_party/kineto'
2024-12-18T03:38:56.6477599Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2024-12-18T03:38:56.6515454Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2024-12-18T03:38:56.6553671Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2024-12-18T03:38:56.6588447Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2024-12-18T03:38:56.6629853Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2024-12-18T03:38:56.6670695Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2024-12-18T03:38:56.6716146Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2024-12-18T03:38:56.6755740Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2024-12-18T03:38:56.6800917Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2024-12-18T03:38:56.6845218Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2024-12-18T03:38:56.6888655Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2024-12-18T03:38:56.6927858Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2024-12-18T03:38:56.6973858Z Entering 'third_party/mimalloc'
2024-12-18T03:38:56.7017855Z Entering 'third_party/nccl/nccl'
2024-12-18T03:38:56.7067968Z Entering 'third_party/nlohmann'
2024-12-18T03:38:56.7106351Z Entering 'third_party/onnx'
2024-12-18T03:38:56.7162451Z Entering 'third_party/onnx/third_party/pybind11'
2024-12-18T03:38:56.7207284Z Entering 'third_party/opentelemetry-cpp'
2024-12-18T03:38:56.7259319Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2024-12-18T03:38:56.7298745Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2024-12-18T03:38:56.7338848Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2024-12-18T03:38:56.7369288Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2024-12-18T03:38:56.7410394Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2024-12-18T03:38:56.7453466Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2024-12-18T03:38:56.7489293Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2024-12-18T03:38:56.7529408Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2024-12-18T03:38:56.7571533Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2024-12-18T03:38:56.7618788Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2024-12-18T03:38:56.7679734Z Entering 'third_party/pocketfft'
2024-12-18T03:38:56.7720240Z Entering 'third_party/protobuf'
2024-12-18T03:38:56.7763382Z Entering 'third_party/protobuf/third_party/benchmark'
2024-12-18T03:38:56.7806315Z Entering 'third_party/protobuf/third_party/googletest'
2024-12-18T03:38:56.7848047Z Entering 'third_party/psimd'
2024-12-18T03:38:56.7894226Z Entering 'third_party/pthreadpool'
2024-12-18T03:38:56.7939751Z Entering 'third_party/pybind11'
2024-12-18T03:38:56.7982673Z Entering 'third_party/python-peachpy'
2024-12-18T03:38:56.8021105Z Entering 'third_party/sleef'
2024-12-18T03:38:56.8061049Z Entering 'third_party/tensorpipe'
2024-12-18T03:38:56.8095587Z Entering 'third_party/tensorpipe/third_party/googletest'
2024-12-18T03:38:56.8131482Z Entering 'third_party/tensorpipe/third_party/libnop'
2024-12-18T03:38:56.8167675Z Entering 'third_party/tensorpipe/third_party/libuv'
2024-12-18T03:38:56.8203718Z Entering 'third_party/tensorpipe/third_party/pybind11'
2024-12-18T03:38:56.8243300Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2024-12-18T03:38:56.8301459Z ##[endgroup]
2024-12-18T03:38:56.8304792Z ##[group]Persisting credentials for submodules
2024-12-18T03:38:56.8306633Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :"
2024-12-18T03:38:56.8597832Z Entering 'android/libs/fbjni'
2024-12-18T03:38:56.8648257Z Entering 'third_party/FP16'
2024-12-18T03:38:56.8691929Z Entering 'third_party/FXdiv'
2024-12-18T03:38:56.8729462Z Entering 'third_party/NNPACK'
2024-12-18T03:38:56.8779174Z Entering 'third_party/NVTX'
2024-12-18T03:38:56.8816257Z Entering 'third_party/VulkanMemoryAllocator'
2024-12-18T03:38:56.8860159Z Entering 'third_party/XNNPACK'
2024-12-18T03:38:56.8908298Z Entering 'third_party/benchmark'
2024-12-18T03:38:56.8947640Z Entering 'third_party/composable_kernel'
2024-12-18T03:38:56.8993696Z Entering 'third_party/cpp-httplib'
2024-12-18T03:38:56.9032855Z Entering 'third_party/cpuinfo'
2024-12-18T03:38:56.9071039Z Entering 'third_party/cudnn_frontend'
2024-12-18T03:38:56.9110354Z Entering 'third_party/cutlass'
2024-12-18T03:38:56.9157429Z Entering 'third_party/eigen'
2024-12-18T03:38:56.9198632Z Entering 'third_party/fbgemm'
2024-12-18T03:38:56.9246718Z Entering 'third_party/fbgemm/third_party/asmjit'
2024-12-18T03:38:56.9291824Z Entering 'third_party/fbgemm/third_party/cpuinfo'
2024-12-18T03:38:56.9333426Z Entering 'third_party/fbgemm/third_party/cutlass'
2024-12-18T03:38:56.9385347Z Entering 'third_party/fbgemm/third_party/googletest'
2024-12-18T03:38:56.9427538Z Entering 'third_party/fbgemm/third_party/hipify_torch'
2024-12-18T03:38:56.9475278Z Entering 'third_party/flatbuffers'
2024-12-18T03:38:56.9520611Z Entering 'third_party/fmt'
2024-12-18T03:38:56.9561312Z Entering 'third_party/gemmlowp/gemmlowp'
2024-12-18T03:38:56.9608006Z Entering 'third_party/gloo'
2024-12-18T03:38:56.9649555Z Entering 'third_party/googletest'
2024-12-18T03:38:56.9703272Z Entering 'third_party/ideep'
2024-12-18T03:38:56.9745178Z Entering 'third_party/ideep/mkl-dnn'
2024-12-18T03:38:56.9806202Z Entering 'third_party/ittapi'
2024-12-18T03:38:56.9848957Z Entering 'third_party/kineto'
2024-12-18T03:38:56.9897368Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2024-12-18T03:38:56.9933756Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2024-12-18T03:38:56.9985112Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2024-12-18T03:38:57.0025404Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2024-12-18T03:38:57.0069477Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2024-12-18T03:38:57.0111377Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2024-12-18T03:38:57.0166814Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2024-12-18T03:38:57.0205730Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2024-12-18T03:38:57.0242415Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2024-12-18T03:38:57.0293557Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2024-12-18T03:38:57.0347918Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2024-12-18T03:38:57.0386273Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2024-12-18T03:38:57.0425792Z Entering 'third_party/mimalloc'
2024-12-18T03:38:57.0477431Z Entering 'third_party/nccl/nccl'
2024-12-18T03:38:57.0518240Z Entering 'third_party/nlohmann'
2024-12-18T03:38:57.0568515Z Entering 'third_party/onnx'
2024-12-18T03:38:57.0632644Z Entering 'third_party/onnx/third_party/pybind11'
2024-12-18T03:38:57.0688795Z Entering 'third_party/opentelemetry-cpp'
2024-12-18T03:38:57.0729606Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2024-12-18T03:38:57.0773818Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2024-12-18T03:38:57.0813001Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2024-12-18T03:38:57.0852863Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2024-12-18T03:38:57.0896824Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2024-12-18T03:38:57.0938681Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2024-12-18T03:38:57.0980118Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2024-12-18T03:38:57.1020412Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2024-12-18T03:38:57.1063606Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2024-12-18T03:38:57.1108220Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-12-18T03:38:57.1173754Z Entering 'third_party/pocketfft' 2024-12-18T03:38:57.1221006Z Entering 'third_party/protobuf' 2024-12-18T03:38:57.1263280Z Entering 'third_party/protobuf/third_party/benchmark' 2024-12-18T03:38:57.1302940Z Entering 'third_party/protobuf/third_party/googletest' 2024-12-18T03:38:57.1345441Z Entering 'third_party/psimd' 2024-12-18T03:38:57.1392339Z Entering 'third_party/pthreadpool' 2024-12-18T03:38:57.1435019Z Entering 'third_party/pybind11' 2024-12-18T03:38:57.1474220Z Entering 'third_party/python-peachpy' 2024-12-18T03:38:57.1514449Z Entering 'third_party/sleef' 2024-12-18T03:38:57.1554355Z Entering 'third_party/tensorpipe' 2024-12-18T03:38:57.1595536Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-12-18T03:38:57.1634810Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-12-18T03:38:57.1672509Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-12-18T03:38:57.1711854Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-12-18T03:38:57.1754744Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-12-18T03:38:57.1824332Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2024-12-18T03:38:57.2085776Z Entering 'android/libs/fbjni' 2024-12-18T03:38:57.2132197Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2024-12-18T03:38:57.2156454Z Entering 'third_party/FP16' 2024-12-18T03:38:57.2192803Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2024-12-18T03:38:57.2219534Z Entering 'third_party/FXdiv' 2024-12-18T03:38:57.2257885Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2024-12-18T03:38:57.2278703Z Entering 'third_party/NNPACK' 2024-12-18T03:38:57.2310802Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2024-12-18T03:38:57.2328270Z Entering 'third_party/NVTX' 2024-12-18T03:38:57.2368461Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2024-12-18T03:38:57.2392650Z Entering 'third_party/VulkanMemoryAllocator' 2024-12-18T03:38:57.2432026Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2024-12-18T03:38:57.2449808Z Entering 'third_party/XNNPACK' 2024-12-18T03:38:57.2484781Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2024-12-18T03:38:57.2515355Z Entering 'third_party/benchmark' 2024-12-18T03:38:57.2557455Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2024-12-18T03:38:57.2579021Z Entering 'third_party/composable_kernel' 2024-12-18T03:38:57.2619546Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2024-12-18T03:38:57.2640830Z Entering 'third_party/cpp-httplib' 2024-12-18T03:38:57.2676282Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 
2024-12-18T03:38:57.2699552Z Entering 'third_party/cpuinfo' 2024-12-18T03:38:57.2735152Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2024-12-18T03:38:57.2753466Z Entering 'third_party/cudnn_frontend' 2024-12-18T03:38:57.2790677Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2024-12-18T03:38:57.2823059Z Entering 'third_party/cutlass' 2024-12-18T03:38:57.2860806Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2024-12-18T03:38:57.2887545Z Entering 'third_party/eigen' 2024-12-18T03:38:57.2926859Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2024-12-18T03:38:57.2946562Z Entering 'third_party/fbgemm' 2024-12-18T03:38:57.2986872Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2024-12-18T03:38:57.3006247Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-12-18T03:38:57.3046610Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2024-12-18T03:38:57.3071189Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-12-18T03:38:57.3110244Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2024-12-18T03:38:57.3128433Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-12-18T03:38:57.3163499Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cutlass/config remote.origin.url 2024-12-18T03:38:57.3186146Z Entering 'third_party/fbgemm/third_party/googletest' 2024-12-18T03:38:57.3224145Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.3241552Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-12-18T03:38:57.3274472Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2024-12-18T03:38:57.3293825Z Entering 'third_party/flatbuffers' 2024-12-18T03:38:57.3332276Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2024-12-18T03:38:57.3353234Z Entering 'third_party/fmt' 2024-12-18T03:38:57.3390883Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2024-12-18T03:38:57.3409341Z Entering 'third_party/gemmlowp/gemmlowp' 2024-12-18T03:38:57.3452040Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2024-12-18T03:38:57.3471985Z Entering 'third_party/gloo' 2024-12-18T03:38:57.3511707Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2024-12-18T03:38:57.3529272Z Entering 'third_party/googletest' 2024-12-18T03:38:57.3569969Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.3597312Z Entering 'third_party/ideep' 2024-12-18T03:38:57.3635843Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 
2024-12-18T03:38:57.3653544Z Entering 'third_party/ideep/mkl-dnn' 2024-12-18T03:38:57.3699882Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2024-12-18T03:38:57.3725178Z Entering 'third_party/ittapi' 2024-12-18T03:38:57.3763985Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2024-12-18T03:38:57.3782852Z Entering 'third_party/kineto' 2024-12-18T03:38:57.3826572Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2024-12-18T03:38:57.3845556Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-12-18T03:38:57.3886100Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2024-12-18T03:38:57.3907797Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-12-18T03:38:57.3949154Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2024-12-18T03:38:57.3971236Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-12-18T03:38:57.4015101Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2024-12-18T03:38:57.4037172Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-12-18T03:38:57.4073215Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2024-12-18T03:38:57.4096485Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-12-18T03:38:57.4131532Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2024-12-18T03:38:57.4152230Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-12-18T03:38:57.4187371Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2024-12-18T03:38:57.4213160Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-12-18T03:38:57.4249689Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2024-12-18T03:38:57.4264122Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-12-18T03:38:57.4299584Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.4316515Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-12-18T03:38:57.4351121Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2024-12-18T03:38:57.4370856Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 
2024-12-18T03:38:57.4419652Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2024-12-18T03:38:57.4439306Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-12-18T03:38:57.4477798Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2024-12-18T03:38:57.4492331Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-12-18T03:38:57.4531365Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.4558676Z Entering 'third_party/mimalloc' 2024-12-18T03:38:57.4596910Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2024-12-18T03:38:57.4619467Z Entering 'third_party/nccl/nccl' 2024-12-18T03:38:57.4659605Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2024-12-18T03:38:57.4676555Z Entering 'third_party/nlohmann' 2024-12-18T03:38:57.4713012Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2024-12-18T03:38:57.4732771Z Entering 'third_party/onnx' 2024-12-18T03:38:57.4773649Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2024-12-18T03:38:57.4804295Z Entering 'third_party/onnx/third_party/pybind11' 2024-12-18T03:38:57.4851652Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2024-12-18T03:38:57.4875853Z Entering 'third_party/opentelemetry-cpp' 2024-12-18T03:38:57.4913575Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2024-12-18T03:38:57.4932645Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-12-18T03:38:57.4980108Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2024-12-18T03:38:57.5000573Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-12-18T03:38:57.5045756Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.5074546Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-12-18T03:38:57.5104758Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2024-12-18T03:38:57.5123127Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-12-18T03:38:57.5158785Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2024-12-18T03:38:57.5182040Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-12-18T03:38:57.5219898Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2024-12-18T03:38:57.5237430Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 
2024-12-18T03:38:57.5273638Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2024-12-18T03:38:57.5291153Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-12-18T03:38:57.5326819Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2024-12-18T03:38:57.5346822Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-12-18T03:38:57.5392016Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2024-12-18T03:38:57.5411322Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-12-18T03:38:57.5453199Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2024-12-18T03:38:57.5476085Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-12-18T03:38:57.5513500Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2024-12-18T03:38:57.5550494Z Entering 'third_party/pocketfft' 2024-12-18T03:38:57.5588021Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2024-12-18T03:38:57.5610761Z Entering 'third_party/protobuf' 2024-12-18T03:38:57.5651046Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2024-12-18T03:38:57.5672365Z Entering 'third_party/protobuf/third_party/benchmark' 2024-12-18T03:38:57.5714127Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2024-12-18T03:38:57.5732991Z Entering 'third_party/protobuf/third_party/googletest' 2024-12-18T03:38:57.5777089Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.5798395Z Entering 'third_party/psimd' 2024-12-18T03:38:57.5833357Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2024-12-18T03:38:57.5852181Z Entering 'third_party/pthreadpool' 2024-12-18T03:38:57.5890563Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2024-12-18T03:38:57.5909315Z Entering 'third_party/pybind11' 2024-12-18T03:38:57.5950044Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2024-12-18T03:38:57.5967782Z Entering 'third_party/python-peachpy' 2024-12-18T03:38:57.6008389Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2024-12-18T03:38:57.6031293Z Entering 'third_party/sleef' 2024-12-18T03:38:57.6068279Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2024-12-18T03:38:57.6089748Z Entering 'third_party/tensorpipe' 2024-12-18T03:38:57.6128632Z 
file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2024-12-18T03:38:57.6150011Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-12-18T03:38:57.6183815Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2024-12-18T03:38:57.6202509Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-12-18T03:38:57.6237050Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2024-12-18T03:38:57.6254283Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-12-18T03:38:57.6289215Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2024-12-18T03:38:57.6310945Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-12-18T03:38:57.6361338Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2024-12-18T03:38:57.6379202Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-12-18T03:38:57.6420578Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2024-12-18T03:38:57.6680213Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2024-12-18T03:38:57.6933435Z Entering 'android/libs/fbjni' 2024-12-18T03:38:57.6971843Z Entering 'third_party/FP16' 2024-12-18T03:38:57.7007506Z Entering 'third_party/FXdiv' 2024-12-18T03:38:57.7048728Z Entering 'third_party/NNPACK' 2024-12-18T03:38:57.7090368Z Entering 'third_party/NVTX' 2024-12-18T03:38:57.7128379Z Entering 'third_party/VulkanMemoryAllocator' 2024-12-18T03:38:57.7325266Z Entering 'third_party/XNNPACK' 2024-12-18T03:38:57.7325598Z Entering 'third_party/benchmark' 2024-12-18T03:38:57.7325890Z Entering 'third_party/composable_kernel' 2024-12-18T03:38:57.7326199Z Entering 'third_party/cpp-httplib' 2024-12-18T03:38:57.7338060Z Entering 'third_party/cpuinfo' 2024-12-18T03:38:57.7379586Z Entering 'third_party/cudnn_frontend' 2024-12-18T03:38:57.7413480Z Entering 'third_party/cutlass' 2024-12-18T03:38:57.7466824Z Entering 'third_party/eigen' 2024-12-18T03:38:57.7511759Z Entering 'third_party/fbgemm' 2024-12-18T03:38:57.7551879Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-12-18T03:38:57.7592802Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-12-18T03:38:57.7636513Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-12-18T03:38:57.7683226Z Entering 'third_party/fbgemm/third_party/googletest' 2024-12-18T03:38:57.7725740Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-12-18T03:38:57.7763258Z Entering 'third_party/flatbuffers' 2024-12-18T03:38:57.7823153Z Entering 'third_party/fmt' 2024-12-18T03:38:57.7859135Z Entering 'third_party/gemmlowp/gemmlowp' 2024-12-18T03:38:57.7899179Z Entering 'third_party/gloo' 2024-12-18T03:38:57.7940304Z Entering 'third_party/googletest' 2024-12-18T03:38:57.7980503Z Entering 'third_party/ideep' 2024-12-18T03:38:57.8019564Z Entering 'third_party/ideep/mkl-dnn' 2024-12-18T03:38:57.8067971Z Entering 'third_party/ittapi' 2024-12-18T03:38:57.8105030Z Entering 'third_party/kineto' 2024-12-18T03:38:57.8141144Z Entering 
'third_party/kineto/libkineto/third_party/dynolog' 2024-12-18T03:38:57.8185440Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-12-18T03:38:57.8222673Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-12-18T03:38:57.8259655Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-12-18T03:38:57.8295135Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-12-18T03:38:57.8339597Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-12-18T03:38:57.8388689Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-12-18T03:38:57.8428408Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-12-18T03:38:57.8475631Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-12-18T03:38:57.8521136Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-12-18T03:38:57.8561101Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-12-18T03:38:57.8597834Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-12-18T03:38:57.8640740Z Entering 'third_party/mimalloc' 2024-12-18T03:38:57.8678370Z Entering 'third_party/nccl/nccl' 2024-12-18T03:38:57.8718065Z Entering 'third_party/nlohmann' 2024-12-18T03:38:57.8755437Z Entering 'third_party/onnx' 2024-12-18T03:38:57.8806880Z Entering 'third_party/onnx/third_party/pybind11' 2024-12-18T03:38:57.8853112Z Entering 'third_party/opentelemetry-cpp' 2024-12-18T03:38:57.8891424Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-12-18T03:38:57.8929112Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-12-18T03:38:57.8966769Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-12-18T03:38:57.9004779Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-12-18T03:38:57.9039186Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-12-18T03:38:57.9075420Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-12-18T03:38:57.9108583Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-12-18T03:38:57.9147272Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-12-18T03:38:57.9195384Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-12-18T03:38:57.9236918Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-12-18T03:38:57.9294877Z Entering 'third_party/pocketfft' 2024-12-18T03:38:57.9339491Z Entering 'third_party/protobuf' 2024-12-18T03:38:57.9382206Z Entering 'third_party/protobuf/third_party/benchmark' 2024-12-18T03:38:57.9428614Z Entering 'third_party/protobuf/third_party/googletest' 2024-12-18T03:38:57.9472744Z Entering 'third_party/psimd' 2024-12-18T03:38:57.9512030Z Entering 'third_party/pthreadpool' 2024-12-18T03:38:57.9564816Z Entering 'third_party/pybind11' 2024-12-18T03:38:57.9600351Z Entering 'third_party/python-peachpy' 2024-12-18T03:38:57.9639269Z Entering 'third_party/sleef' 2024-12-18T03:38:57.9676449Z Entering 'third_party/tensorpipe' 2024-12-18T03:38:57.9711612Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-12-18T03:38:57.9750478Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-12-18T03:38:57.9788524Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-12-18T03:38:57.9827400Z Entering 
'third_party/tensorpipe/third_party/pybind11' 2024-12-18T03:38:57.9869234Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-12-18T03:38:57.9939451Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2024-12-18T03:38:58.0194336Z Entering 'android/libs/fbjni' 2024-12-18T03:38:58.0230376Z Entering 'third_party/FP16' 2024-12-18T03:38:58.0267615Z Entering 'third_party/FXdiv' 2024-12-18T03:38:58.0311726Z Entering 'third_party/NNPACK' 2024-12-18T03:38:58.0347993Z Entering 'third_party/NVTX' 2024-12-18T03:38:58.0383924Z Entering 'third_party/VulkanMemoryAllocator' 2024-12-18T03:38:58.0420518Z Entering 'third_party/XNNPACK' 2024-12-18T03:38:58.0475156Z Entering 'third_party/benchmark' 2024-12-18T03:38:58.0511467Z Entering 'third_party/composable_kernel' 2024-12-18T03:38:58.0553439Z Entering 'third_party/cpp-httplib' 2024-12-18T03:38:58.0592811Z Entering 'third_party/cpuinfo' 2024-12-18T03:38:58.0632393Z Entering 'third_party/cudnn_frontend' 2024-12-18T03:38:58.0667815Z Entering 'third_party/cutlass' 2024-12-18T03:38:58.0712473Z Entering 'third_party/eigen' 2024-12-18T03:38:58.0754580Z Entering 'third_party/fbgemm' 2024-12-18T03:38:58.0789570Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-12-18T03:38:58.0826700Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-12-18T03:38:58.0865663Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-12-18T03:38:58.0904170Z Entering 'third_party/fbgemm/third_party/googletest' 2024-12-18T03:38:58.0939394Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-12-18T03:38:58.0978944Z Entering 'third_party/flatbuffers' 2024-12-18T03:38:58.1025442Z Entering 'third_party/fmt' 2024-12-18T03:38:58.1066437Z Entering 'third_party/gemmlowp/gemmlowp' 2024-12-18T03:38:58.1103577Z Entering 'third_party/gloo' 2024-12-18T03:38:58.1138223Z Entering 'third_party/googletest' 2024-12-18T03:38:58.1172901Z Entering 'third_party/ideep' 2024-12-18T03:38:58.1210620Z Entering 'third_party/ideep/mkl-dnn' 2024-12-18T03:38:58.1259826Z Entering 'third_party/ittapi' 2024-12-18T03:38:58.1302706Z Entering 'third_party/kineto' 2024-12-18T03:38:58.1340445Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-12-18T03:38:58.1379865Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-12-18T03:38:58.1420841Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-12-18T03:38:58.1454560Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-12-18T03:38:58.1488360Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-12-18T03:38:58.1529119Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-12-18T03:38:58.1574490Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-12-18T03:38:58.1610730Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-12-18T03:38:58.1651805Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-12-18T03:38:58.1691793Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-12-18T03:38:58.1732835Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-12-18T03:38:58.1772009Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-12-18T03:38:58.1819498Z Entering 'third_party/mimalloc' 2024-12-18T03:38:58.1859476Z Entering 'third_party/nccl/nccl' 
2024-12-18T03:38:58.1899797Z Entering 'third_party/nlohmann'
2024-12-18T03:38:58.1941181Z Entering 'third_party/onnx'
2024-12-18T03:38:58.1991923Z Entering 'third_party/onnx/third_party/pybind11'
2024-12-18T03:38:58.2038204Z Entering 'third_party/opentelemetry-cpp'
2024-12-18T03:38:58.2077990Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2024-12-18T03:38:58.2113617Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2024-12-18T03:38:58.2152286Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2024-12-18T03:38:58.2190132Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2024-12-18T03:38:58.2224276Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2024-12-18T03:38:58.2262163Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2024-12-18T03:38:58.2298963Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2024-12-18T03:38:58.2332040Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2024-12-18T03:38:58.2372567Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2024-12-18T03:38:58.2415244Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2024-12-18T03:38:58.2473840Z Entering 'third_party/pocketfft'
2024-12-18T03:38:58.2512995Z Entering 'third_party/protobuf'
2024-12-18T03:38:58.2552518Z Entering 'third_party/protobuf/third_party/benchmark'
2024-12-18T03:38:58.2590114Z Entering 'third_party/protobuf/third_party/googletest'
2024-12-18T03:38:58.2635031Z Entering 'third_party/psimd'
2024-12-18T03:38:58.2670181Z Entering 'third_party/pthreadpool'
2024-12-18T03:38:58.2714947Z Entering 'third_party/pybind11'
2024-12-18T03:38:58.2751728Z Entering 'third_party/python-peachpy'
2024-12-18T03:38:58.2792157Z Entering 'third_party/sleef'
2024-12-18T03:38:58.2832693Z Entering 'third_party/tensorpipe'
2024-12-18T03:38:58.2872509Z Entering 'third_party/tensorpipe/third_party/googletest'
2024-12-18T03:38:58.2913060Z Entering 'third_party/tensorpipe/third_party/libnop'
2024-12-18T03:38:58.2950413Z Entering 'third_party/tensorpipe/third_party/libuv'
2024-12-18T03:38:58.2983641Z Entering 'third_party/tensorpipe/third_party/pybind11'
2024-12-18T03:38:58.3034926Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2024-12-18T03:38:58.3102829Z ##[endgroup]
2024-12-18T03:38:58.3155524Z [command]/usr/bin/git log -1 --format='%H'
2024-12-18T03:38:58.3183517Z '0cdf8b1d09254cfda66191d1bd01e3041c3c76f7'
2024-12-18T03:38:58.3383000Z Prepare all required actions
2024-12-18T03:38:58.3383857Z Getting action download info
2024-12-18T03:38:58.3456726Z ##[group]Run ./.github/actions/setup-rocm
2024-12-18T03:38:58.3457172Z env:
2024-12-18T03:38:58.3457460Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.3457826Z ##[endgroup]
2024-12-18T03:38:58.3483302Z ##[group]Run echo "DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock" >> "${GITHUB_ENV}"
2024-12-18T03:38:58.3484161Z echo "DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock" >> "${GITHUB_ENV}"
2024-12-18T03:38:58.3526467Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:38:58.3527005Z env:
2024-12-18T03:38:58.3527303Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.3527652Z ##[endgroup]
2024-12-18T03:38:58.3644379Z ##[group]Run set -ex
2024-12-18T03:38:58.3644619Z set -ex
2024-12-18T03:38:58.3644820Z 
2024-12-18T03:38:58.3645040Z cat ~/.docker/config.json || true
2024-12-18T03:38:58.3645788Z # https://stackoverflow.com/questions/64455468/error-when-logging-into-ecr-with-docker-login-error-saving-credentials-not
2024-12-18T03:38:58.3646547Z rm -f ~/.docker/config.json
2024-12-18T03:38:58.3695807Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:38:58.3696464Z env:
2024-12-18T03:38:58.3696842Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.3697368Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:38:58.3697955Z ##[endgroup]
2024-12-18T03:38:58.3758071Z + cat /home/pytorchci/.docker/config.json
2024-12-18T03:38:58.3765298Z {
2024-12-18T03:38:58.3765631Z   "auths": {}
2024-12-18T03:38:58.3782939Z }
2024-12-18T03:38:58.3766614Z + rm -f /home/pytorchci/.docker/config.json
2024-12-18T03:38:58.3816161Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty
2024-12-18T03:38:58.3816974Z # ignore expansion of "docker ps -q" since it could be empty
2024-12-18T03:38:58.3817595Z # shellcheck disable=SC2046
2024-12-18T03:38:58.3818104Z docker stop $(docker ps -q) || true
2024-12-18T03:38:58.3818997Z # Prune all stopped containers.
2024-12-18T03:38:58.3819463Z docker container prune -f
2024-12-18T03:38:58.3860080Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:38:58.3860608Z env:
2024-12-18T03:38:58.3860907Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.3861325Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:38:58.3861780Z ##[endgroup]
2024-12-18T03:38:58.4233394Z "docker stop" requires at least 1 argument.
2024-12-18T03:38:58.4233948Z See 'docker stop --help'.
2024-12-18T03:38:58.4234198Z 
2024-12-18T03:38:58.4234440Z Usage:  docker stop [OPTIONS] CONTAINER [CONTAINER...]
2024-12-18T03:38:58.4234812Z 
2024-12-18T03:38:58.4234974Z Stop one or more running containers
2024-12-18T03:38:58.4394660Z Total reclaimed space: 0B
2024-12-18T03:38:58.4437603Z ##[group]Run cat /etc/os-release || true
2024-12-18T03:38:58.4437869Z cat /etc/os-release || true
2024-12-18T03:38:58.4438135Z cat /etc/apt/sources.list.d/rocm.list || true
2024-12-18T03:38:58.4438416Z cat /opt/rocm/.info/version || true
2024-12-18T03:38:58.4438638Z whoami
2024-12-18T03:38:58.4460069Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:38:58.4460397Z env:
2024-12-18T03:38:58.4460579Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.4460843Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:38:58.4461142Z ##[endgroup]
2024-12-18T03:38:58.4504938Z PRETTY_NAME="Ubuntu 22.04.3 LTS"
2024-12-18T03:38:58.4505235Z NAME="Ubuntu"
2024-12-18T03:38:58.4505439Z VERSION_ID="22.04"
2024-12-18T03:38:58.4505675Z VERSION="22.04.3 LTS (Jammy Jellyfish)"
2024-12-18T03:38:58.4505948Z VERSION_CODENAME=jammy
2024-12-18T03:38:58.4506167Z ID=ubuntu
2024-12-18T03:38:58.4506355Z ID_LIKE=debian
2024-12-18T03:38:58.4506584Z HOME_URL="https://www.ubuntu.com/"
2024-12-18T03:38:58.4506888Z SUPPORT_URL="https://help.ubuntu.com/"
2024-12-18T03:38:58.4507545Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
2024-12-18T03:38:58.4508067Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
2024-12-18T03:38:58.4508519Z UBUNTU_CODENAME=jammy
2024-12-18T03:38:58.4516866Z deb [arch=amd64] https://repo.radeon.com/rocm/apt/6.2.1 jammy main
2024-12-18T03:38:58.4529862Z 6.2.1-112
2024-12-18T03:38:58.4542751Z pytorchci
2024-12-18T03:38:58.4565422Z ##[group]Run rocm-smi
2024-12-18T03:38:58.4565767Z rocm-smi
2024-12-18T03:38:58.4601049Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:38:58.4601575Z env:
2024-12-18T03:38:58.4601862Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.4602289Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:38:58.4602747Z ##[endgroup]
2024-12-18T03:38:58.5427437Z 
2024-12-18T03:38:58.5427542Z 
2024-12-18T03:38:58.5427836Z ========================================= ROCm System Management Interface =========================================
2024-12-18T03:38:58.5428371Z =================================================== Concise Info ===================================================
2024-12-18T03:38:58.5428946Z Device  Node  IDs              Temp    Power  Partitions          SCLK    MCLK     Fan  Perf  PwrCap  VRAM%  GPU%
2024-12-18T03:38:58.5429775Z               (DID,   GUID)    (Edge)  (Avg)  (Mem, Compute, ID)
2024-12-18T03:38:58.5430202Z ====================================================================================================================
2024-12-18T03:38:58.5431362Z 0       2     0x740f, 12261    49.0°C  40.0W  N/A, N/A, 0         800Mhz  1600Mhz  0%   auto  300.0W  0%     0%
2024-12-18T03:38:58.5432521Z 1       3     0x740f, 36740    44.0°C  43.0W  N/A, N/A, 0         800Mhz  1600Mhz  0%   auto  300.0W  0%     0%
2024-12-18T03:38:58.5433307Z ====================================================================================================================
2024-12-18T03:38:58.5434035Z =============================================== End of ROCm SMI Log ================================================
2024-12-18T03:38:58.5525793Z ##[group]Run rocminfo
2024-12-18T03:38:58.5526163Z rocminfo
2024-12-18T03:38:58.5567854Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:38:58.5568389Z env:
2024-12-18T03:38:58.5568679Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:38:58.5569098Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:38:58.5569550Z ##[endgroup]
2024-12-18T03:38:58.6304603Z ROCk module version 6.8.5 is loaded
2024-12-18T03:38:58.6304900Z =====================
2024-12-18T03:38:58.6305095Z HSA System Attributes
2024-12-18T03:38:58.6305289Z =====================
2024-12-18T03:38:58.6305471Z Runtime Version: 1.14
2024-12-18T03:38:58.6305692Z Runtime Ext Version: 1.6
2024-12-18T03:38:58.6305899Z System Timestamp Freq.: 1000.000000MHz
2024-12-18T03:38:58.6306270Z Sig.
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2024-12-18T03:38:58.6306649Z Machine Model: LARGE 2024-12-18T03:38:58.6306997Z System Endianness: LITTLE 2024-12-18T03:38:58.6307273Z Mwaitx: DISABLED 2024-12-18T03:38:58.6307474Z DMAbuf Support: YES 2024-12-18T03:38:58.6307606Z 2024-12-18T03:38:58.6307671Z ========== 2024-12-18T03:38:58.6307853Z HSA Agents 2024-12-18T03:38:58.6308018Z ========== 2024-12-18T03:38:58.6308183Z ******* 2024-12-18T03:38:58.6308354Z Agent 1 2024-12-18T03:38:58.6308516Z ******* 2024-12-18T03:38:58.6308732Z Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:38:58.6309030Z Uuid: CPU-XX 2024-12-18T03:38:58.6309308Z Marketing Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:38:58.6309855Z Vendor Name: CPU 2024-12-18T03:38:58.6310159Z Feature: None specified 2024-12-18T03:38:58.6310463Z Profile: FULL_PROFILE 2024-12-18T03:38:58.6310744Z Float Round Mode: NEAR 2024-12-18T03:38:58.6311155Z Max Queue Number: 0(0x0) 2024-12-18T03:38:58.6311430Z Queue Min Size: 0(0x0) 2024-12-18T03:38:58.6311707Z Queue Max Size: 0(0x0) 2024-12-18T03:38:58.6311981Z Queue Type: MULTI 2024-12-18T03:38:58.6312236Z Node: 0 2024-12-18T03:38:58.6312497Z Device Type: CPU 2024-12-18T03:38:58.6312741Z Cache Info: 2024-12-18T03:38:58.6312948Z L1: 32768(0x8000) KB 2024-12-18T03:38:58.6313214Z Chip ID: 0(0x0) 2024-12-18T03:38:58.6313493Z ASIC Revision: 0(0x0) 2024-12-18T03:38:58.6313781Z Cacheline Size: 64(0x40) 2024-12-18T03:38:58.6314058Z Max Clock Freq. (MHz): 2600 2024-12-18T03:38:58.6314319Z BDFID: 0 2024-12-18T03:38:58.6314590Z Internal Node ID: 0 2024-12-18T03:38:58.6314868Z Compute Unit: 32 2024-12-18T03:38:58.6315140Z SIMDs per CU: 0 2024-12-18T03:38:58.6315412Z Shader Engines: 0 2024-12-18T03:38:58.6315695Z Shader Arrs. per Eng.: 0 2024-12-18T03:38:58.6315990Z WatchPts on Addr. 
Ranges:1 2024-12-18T03:38:58.6316263Z Memory Properties: 2024-12-18T03:38:58.6316460Z Features: None 2024-12-18T03:38:58.6316820Z Pool Info: 2024-12-18T03:38:58.6317009Z Pool 1 2024-12-18T03:38:58.6317241Z Segment: GLOBAL; FLAGS: FINE GRAINED 2024-12-18T03:38:58.6317525Z Size: 65839404(0x3eca12c) KB 2024-12-18T03:38:58.6317807Z Allocatable: TRUE 2024-12-18T03:38:58.6318091Z Alloc Granule: 4KB 2024-12-18T03:38:58.6318457Z Alloc Recommended Granule:4KB 2024-12-18T03:38:58.6318765Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6319056Z Accessible by all: TRUE 2024-12-18T03:38:58.6319311Z Pool 2 2024-12-18T03:38:58.6319551Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2024-12-18T03:38:58.6319821Z Size: 65839404(0x3eca12c) KB 2024-12-18T03:38:58.6320106Z Allocatable: TRUE 2024-12-18T03:38:58.6320390Z Alloc Granule: 4KB 2024-12-18T03:38:58.6320684Z Alloc Recommended Granule:4KB 2024-12-18T03:38:58.6320990Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6321278Z Accessible by all: TRUE 2024-12-18T03:38:58.6321522Z Pool 3 2024-12-18T03:38:58.6321750Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:38:58.6322019Z Size: 65839404(0x3eca12c) KB 2024-12-18T03:38:58.6322287Z Allocatable: TRUE 2024-12-18T03:38:58.6322681Z Alloc Granule: 4KB 2024-12-18T03:38:58.6322982Z Alloc Recommended Granule:4KB 2024-12-18T03:38:58.6323284Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6323570Z Accessible by all: TRUE 2024-12-18T03:38:58.6323823Z ISA Info: 2024-12-18T03:38:58.6324000Z ******* 2024-12-18T03:38:58.6324169Z Agent 2 2024-12-18T03:38:58.6324361Z ******* 2024-12-18T03:38:58.6324564Z Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:38:58.6324832Z Uuid: CPU-XX 2024-12-18T03:38:58.6325122Z Marketing Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:38:58.6325412Z Vendor Name: CPU 2024-12-18T03:38:58.6325693Z Feature: None specified 2024-12-18T03:38:58.6325983Z Profile: FULL_PROFILE 2024-12-18T03:38:58.6326263Z Float Round Mode: NEAR 2024-12-18T03:38:58.6326552Z Max Queue Number: 0(0x0) 2024-12-18T03:38:58.6326834Z Queue Min Size: 0(0x0) 2024-12-18T03:38:58.6327101Z Queue Max Size: 0(0x0) 2024-12-18T03:38:58.6327375Z Queue Type: MULTI 2024-12-18T03:38:58.6327630Z Node: 1 2024-12-18T03:38:58.6327883Z Device Type: CPU 2024-12-18T03:38:58.6328132Z Cache Info: 2024-12-18T03:38:58.6328338Z L1: 32768(0x8000) KB 2024-12-18T03:38:58.6328591Z Chip ID: 0(0x0) 2024-12-18T03:38:58.6328867Z ASIC Revision: 0(0x0) 2024-12-18T03:38:58.6329148Z Cacheline Size: 64(0x40) 2024-12-18T03:38:58.6329533Z Max Clock Freq. (MHz): 2600 2024-12-18T03:38:58.6329794Z BDFID: 0 2024-12-18T03:38:58.6330070Z Internal Node ID: 1 2024-12-18T03:38:58.6330390Z Compute Unit: 32 2024-12-18T03:38:58.6330706Z SIMDs per CU: 0 2024-12-18T03:38:58.6331029Z Shader Engines: 0 2024-12-18T03:38:58.6331359Z Shader Arrs. per Eng.: 0 2024-12-18T03:38:58.6331713Z WatchPts on Addr. 
Ranges:1 2024-12-18T03:38:58.6331986Z Memory Properties: 2024-12-18T03:38:58.6332176Z Features: None 2024-12-18T03:38:58.6332367Z Pool Info: 2024-12-18T03:38:58.6332557Z Pool 1 2024-12-18T03:38:58.6332794Z Segment: GLOBAL; FLAGS: FINE GRAINED 2024-12-18T03:38:58.6333080Z Size: 65997864(0x3ef0c28) KB 2024-12-18T03:38:58.6333353Z Allocatable: TRUE 2024-12-18T03:38:58.6333641Z Alloc Granule: 4KB 2024-12-18T03:38:58.6333938Z Alloc Recommended Granule:4KB 2024-12-18T03:38:58.6334246Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6334615Z Accessible by all: TRUE 2024-12-18T03:38:58.6334865Z Pool 2 2024-12-18T03:38:58.6335104Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2024-12-18T03:38:58.6335380Z Size: 65997864(0x3ef0c28) KB 2024-12-18T03:38:58.6335776Z Allocatable: TRUE 2024-12-18T03:38:58.6336069Z Alloc Granule: 4KB 2024-12-18T03:38:58.6336376Z Alloc Recommended Granule:4KB 2024-12-18T03:38:58.6336669Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6336962Z Accessible by all: TRUE 2024-12-18T03:38:58.6337216Z Pool 3 2024-12-18T03:38:58.6337436Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:38:58.6337708Z Size: 65997864(0x3ef0c28) KB 2024-12-18T03:38:58.6337982Z Allocatable: TRUE 2024-12-18T03:38:58.6338259Z Alloc Granule: 4KB 2024-12-18T03:38:58.6338556Z Alloc Recommended Granule:4KB 2024-12-18T03:38:58.6338864Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6339152Z Accessible by all: TRUE 2024-12-18T03:38:58.6339410Z ISA Info: 2024-12-18T03:38:58.6339594Z ******* 2024-12-18T03:38:58.6339815Z Agent 3 2024-12-18T03:38:58.6339992Z ******* 2024-12-18T03:38:58.6340188Z Name: gfx90a 2024-12-18T03:38:58.6340443Z Uuid: GPU-48775b8a453f84af 2024-12-18T03:38:58.6340725Z Marketing Name: AMD Instinct MI210 2024-12-18T03:38:58.6341008Z Vendor Name: AMD 2024-12-18T03:38:58.6341279Z Feature: KERNEL_DISPATCH 2024-12-18T03:38:58.6341553Z Profile: BASE_PROFILE 2024-12-18T03:38:58.6341831Z Float Round Mode: NEAR 2024-12-18T03:38:58.6342114Z Max Queue Number: 128(0x80) 2024-12-18T03:38:58.6344817Z Queue Min Size: 64(0x40) 2024-12-18T03:38:58.6345089Z Queue Max Size: 131072(0x20000) 2024-12-18T03:38:58.6345352Z Queue Type: MULTI 2024-12-18T03:38:58.6345605Z Node: 2 2024-12-18T03:38:58.6345860Z Device Type: GPU 2024-12-18T03:38:58.6346102Z Cache Info: 2024-12-18T03:38:58.6346301Z L1: 16(0x10) KB 2024-12-18T03:38:58.6346539Z L2: 8192(0x2000) KB 2024-12-18T03:38:58.6346782Z Chip ID: 29711(0x740f) 2024-12-18T03:38:58.6347051Z ASIC Revision: 1(0x1) 2024-12-18T03:38:58.6347340Z Cacheline Size: 64(0x40) 2024-12-18T03:38:58.6347622Z Max Clock Freq. (MHz): 1700 2024-12-18T03:38:58.6347884Z BDFID: 768 2024-12-18T03:38:58.6348146Z Internal Node ID: 2 2024-12-18T03:38:58.6348423Z Compute Unit: 104 2024-12-18T03:38:58.6348686Z SIMDs per CU: 4 2024-12-18T03:38:58.6348961Z Shader Engines: 8 2024-12-18T03:38:58.6349248Z Shader Arrs. per Eng.: 1 2024-12-18T03:38:58.6349533Z WatchPts on Addr. 
Ranges:4 2024-12-18T03:38:58.6349830Z Coherent Host Access: FALSE 2024-12-18T03:38:58.6350108Z Memory Properties: 2024-12-18T03:38:58.6350346Z Features: KERNEL_DISPATCH 2024-12-18T03:38:58.6350786Z Fast F16 Operation: TRUE 2024-12-18T03:38:58.6351142Z Wavefront Size: 64(0x40) 2024-12-18T03:38:58.6351478Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:38:58.6351791Z Workgroup Max Size per Dimension: 2024-12-18T03:38:58.6352017Z x 1024(0x400) 2024-12-18T03:38:58.6352246Z y 1024(0x400) 2024-12-18T03:38:58.6352470Z z 1024(0x400) 2024-12-18T03:38:58.6352725Z Max Waves Per CU: 32(0x20) 2024-12-18T03:38:58.6353003Z Max Work-item Per CU: 2048(0x800) 2024-12-18T03:38:58.6353288Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:38:58.6353540Z Grid Max Size per Dimension: 2024-12-18T03:38:58.6353744Z x 4294967295(0xffffffff) 2024-12-18T03:38:58.6353983Z y 4294967295(0xffffffff) 2024-12-18T03:38:58.6354226Z z 4294967295(0xffffffff) 2024-12-18T03:38:58.6354491Z Max fbarriers/Workgrp: 32 2024-12-18T03:38:58.6357428Z Packet Processor uCode:: 83 2024-12-18T03:38:58.6357780Z SDMA engine uCode:: 8 2024-12-18T03:38:58.6358093Z IOMMU Support:: None 2024-12-18T03:38:58.6358358Z Pool Info: 2024-12-18T03:38:58.6358556Z Pool 1 2024-12-18T03:38:58.6358801Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:38:58.6359092Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:38:58.6359370Z Allocatable: TRUE 2024-12-18T03:38:58.6359665Z Alloc Granule: 4KB 2024-12-18T03:38:58.6359959Z Alloc Recommended Granule:2048KB 2024-12-18T03:38:58.6360449Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6360743Z Accessible by all: FALSE 2024-12-18T03:38:58.6360991Z Pool 2 2024-12-18T03:38:58.6361234Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2024-12-18T03:38:58.6361533Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:38:58.6361814Z Allocatable: TRUE 2024-12-18T03:38:58.6362102Z Alloc Granule: 4KB 2024-12-18T03:38:58.6362400Z Alloc Recommended Granule:2048KB 2024-12-18T03:38:58.6362761Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6363057Z Accessible by all: FALSE 2024-12-18T03:38:58.6363312Z Pool 3 2024-12-18T03:38:58.6363539Z Segment: GROUP 2024-12-18T03:38:58.6363797Z Size: 64(0x40) KB 2024-12-18T03:38:58.6364061Z Allocatable: FALSE 2024-12-18T03:38:58.6364344Z Alloc Granule: 0KB 2024-12-18T03:38:58.6364643Z Alloc Recommended Granule:0KB 2024-12-18T03:38:58.6364944Z Alloc Alignment: 0KB 2024-12-18T03:38:58.6365231Z Accessible by all: FALSE 2024-12-18T03:38:58.6365485Z ISA Info: 2024-12-18T03:38:58.6365675Z ISA 1 2024-12-18T03:38:58.6365914Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2024-12-18T03:38:58.6366340Z Machine Models: HSA_MACHINE_MODEL_LARGE 2024-12-18T03:38:58.6366652Z Profiles: HSA_PROFILE_BASE 2024-12-18T03:38:58.6366951Z Default Rounding Mode: NEAR 2024-12-18T03:38:58.6367258Z Default Rounding Mode: NEAR 2024-12-18T03:38:58.6367537Z Fast f16: TRUE 2024-12-18T03:38:58.6367820Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:38:58.6368080Z Workgroup Max Size per Dimension: 2024-12-18T03:38:58.6368317Z x 1024(0x400) 2024-12-18T03:38:58.6368558Z y 1024(0x400) 2024-12-18T03:38:58.6368782Z z 1024(0x400) 2024-12-18T03:38:58.6369039Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:38:58.6369298Z Grid Max Size per Dimension: 2024-12-18T03:38:58.6369512Z x 4294967295(0xffffffff) 2024-12-18T03:38:58.6369757Z y 4294967295(0xffffffff) 2024-12-18T03:38:58.6369995Z z 4294967295(0xffffffff) 2024-12-18T03:38:58.6370255Z FBarrier Max Size: 32 2024-12-18T03:38:58.6370504Z ******* 2024-12-18T03:38:58.6370680Z Agent 4 
2024-12-18T03:38:58.6370847Z ******* 2024-12-18T03:38:58.6371051Z Name: gfx90a 2024-12-18T03:38:58.6371314Z Uuid: GPU-85feac2c39886449 2024-12-18T03:38:58.6371588Z Marketing Name: AMD Instinct MI210 2024-12-18T03:38:58.6371873Z Vendor Name: AMD 2024-12-18T03:38:58.6372151Z Feature: KERNEL_DISPATCH 2024-12-18T03:38:58.6372421Z Profile: BASE_PROFILE 2024-12-18T03:38:58.6372811Z Float Round Mode: NEAR 2024-12-18T03:38:58.6373114Z Max Queue Number: 128(0x80) 2024-12-18T03:38:58.6373392Z Queue Min Size: 64(0x40) 2024-12-18T03:38:58.6373664Z Queue Max Size: 131072(0x20000) 2024-12-18T03:38:58.6373934Z Queue Type: MULTI 2024-12-18T03:38:58.6374185Z Node: 3 2024-12-18T03:38:58.6374442Z Device Type: GPU 2024-12-18T03:38:58.6374763Z Cache Info: 2024-12-18T03:38:58.6374961Z L1: 16(0x10) KB 2024-12-18T03:38:58.6375204Z L2: 8192(0x2000) KB 2024-12-18T03:38:58.6375453Z Chip ID: 29711(0x740f) 2024-12-18T03:38:58.6375717Z ASIC Revision: 1(0x1) 2024-12-18T03:38:58.6376002Z Cacheline Size: 64(0x40) 2024-12-18T03:38:58.6376282Z Max Clock Freq. (MHz): 1700 2024-12-18T03:38:58.6376538Z BDFID: 33536 2024-12-18T03:38:58.6376803Z Internal Node ID: 3 2024-12-18T03:38:58.6377083Z Compute Unit: 104 2024-12-18T03:38:58.6377346Z SIMDs per CU: 4 2024-12-18T03:38:58.6377626Z Shader Engines: 8 2024-12-18T03:38:58.6377910Z Shader Arrs. per Eng.: 1 2024-12-18T03:38:58.6378199Z WatchPts on Addr. Ranges:4 2024-12-18T03:38:58.6378617Z Coherent Host Access: FALSE 2024-12-18T03:38:58.6378886Z Memory Properties: 2024-12-18T03:38:58.6379095Z Features: KERNEL_DISPATCH 2024-12-18T03:38:58.6379353Z Fast F16 Operation: TRUE 2024-12-18T03:38:58.6379642Z Wavefront Size: 64(0x40) 2024-12-18T03:38:58.6379928Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:38:58.6380185Z Workgroup Max Size per Dimension: 2024-12-18T03:38:58.6380408Z x 1024(0x400) 2024-12-18T03:38:58.6380640Z y 1024(0x400) 2024-12-18T03:38:58.6380871Z z 1024(0x400) 2024-12-18T03:38:58.6381126Z Max Waves Per CU: 32(0x20) 2024-12-18T03:38:58.6381407Z Max Work-item Per CU: 2048(0x800) 2024-12-18T03:38:58.6381687Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:38:58.6381945Z Grid Max Size per Dimension: 2024-12-18T03:38:58.6382152Z x 4294967295(0xffffffff) 2024-12-18T03:38:58.6382389Z y 4294967295(0xffffffff) 2024-12-18T03:38:58.6382617Z z 4294967295(0xffffffff) 2024-12-18T03:38:58.6382888Z Max fbarriers/Workgrp: 32 2024-12-18T03:38:58.6383192Z Packet Processor uCode:: 83 2024-12-18T03:38:58.6383490Z SDMA engine uCode:: 8 2024-12-18T03:38:58.6383777Z IOMMU Support:: None 2024-12-18T03:38:58.6384022Z Pool Info: 2024-12-18T03:38:58.6384211Z Pool 1 2024-12-18T03:38:58.6384446Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:38:58.6384725Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:38:58.6385126Z Allocatable: TRUE 2024-12-18T03:38:58.6385409Z Alloc Granule: 4KB 2024-12-18T03:38:58.6385701Z Alloc Recommended Granule:2048KB 2024-12-18T03:38:58.6386006Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6386295Z Accessible by all: FALSE 2024-12-18T03:38:58.6386540Z Pool 2 2024-12-18T03:38:58.6386771Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2024-12-18T03:38:58.6387042Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:38:58.6387304Z Allocatable: TRUE 2024-12-18T03:38:58.6387586Z Alloc Granule: 4KB 2024-12-18T03:38:58.6387900Z Alloc Recommended Granule:2048KB 2024-12-18T03:38:58.6388216Z Alloc Alignment: 4KB 2024-12-18T03:38:58.6388500Z Accessible by all: FALSE 2024-12-18T03:38:58.6388752Z Pool 3 2024-12-18T03:38:58.6388968Z Segment: GROUP 2024-12-18T03:38:58.6389219Z Size: 64(0x40) KB 
2024-12-18T03:38:58.6389483Z Allocatable: FALSE 2024-12-18T03:38:58.6389768Z Alloc Granule: 0KB 2024-12-18T03:38:58.6390074Z Alloc Recommended Granule:0KB 2024-12-18T03:38:58.6390430Z Alloc Alignment: 0KB 2024-12-18T03:38:58.6390778Z Accessible by all: FALSE 2024-12-18T03:38:58.6391195Z ISA Info: 2024-12-18T03:38:58.6391411Z ISA 1 2024-12-18T03:38:58.6391696Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2024-12-18T03:38:58.6392057Z Machine Models: HSA_MACHINE_MODEL_LARGE 2024-12-18T03:38:58.6392397Z Profiles: HSA_PROFILE_BASE 2024-12-18T03:38:58.6392694Z Default Rounding Mode: NEAR 2024-12-18T03:38:58.6392990Z Default Rounding Mode: NEAR 2024-12-18T03:38:58.6393270Z Fast f16: TRUE 2024-12-18T03:38:58.6393553Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:38:58.6393819Z Workgroup Max Size per Dimension: 2024-12-18T03:38:58.6394057Z x 1024(0x400) 2024-12-18T03:38:58.6394299Z y 1024(0x400) 2024-12-18T03:38:58.6394532Z z 1024(0x400) 2024-12-18T03:38:58.6394795Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:38:58.6395053Z Grid Max Size per Dimension: 2024-12-18T03:38:58.6395264Z x 4294967295(0xffffffff) 2024-12-18T03:38:58.6395502Z y 4294967295(0xffffffff) 2024-12-18T03:38:58.6395739Z z 4294967295(0xffffffff) 2024-12-18T03:38:58.6396000Z FBarrier Max Size: 32 2024-12-18T03:38:58.6396251Z *** Done *** 2024-12-18T03:38:58.6534456Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2024-12-18T03:38:58.6535396Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2024-12-18T03:38:58.6536709Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2024-12-18T03:38:58.6537942Z if [[ $ngpu -eq 0 ]]; then 2024-12-18T03:38:58.6539030Z  echo "Error: Failed to detect any GPUs on the runner" 2024-12-18T03:38:58.6539682Z  echo "$msg" 2024-12-18T03:38:58.6540111Z  exit 1 2024-12-18T03:38:58.6540508Z fi 2024-12-18T03:38:58.6540896Z if [[ $ngpu -eq 1 ]]; then 2024-12-18T03:38:58.6541802Z  echo "Error: only 1 GPU detected, at least 2 GPUs are needed for distributed jobs" 2024-12-18T03:38:58.6542741Z  echo "$msg" 2024-12-18T03:38:58.6543249Z  exit 1 2024-12-18T03:38:58.6543695Z fi 2024-12-18T03:38:58.6589413Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:38:58.6589762Z env: 2024-12-18T03:38:58.6589951Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:58.6590235Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:58.6590537Z ##[endgroup] 2024-12-18T03:38:58.7668876Z Prepare all required actions 2024-12-18T03:38:58.7715589Z ##[group]Run ./.github/actions/diskspace-cleanup 2024-12-18T03:38:58.7715898Z with: 2024-12-18T03:38:58.7716090Z diskspace-cutoff: 70 2024-12-18T03:38:58.7716303Z env: 2024-12-18T03:38:58.7716490Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:58.7716760Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:58.7717058Z ##[endgroup] 2024-12-18T03:38:58.7766024Z ##[group]Run set -ex 2024-12-18T03:38:58.7766271Z set -ex 2024-12-18T03:38:58.7766486Z diskspace_cutoff=70 2024-12-18T03:38:58.7766813Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2024-12-18T03:38:58.7767346Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2024-12-18T03:38:58.7768145Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2024-12-18T03:38:58.7769010Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2024-12-18T03:38:58.7769350Z  docker system prune -af 2024-12-18T03:38:58.7769815Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2024-12-18T03:38:58.7770379Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2024-12-18T03:38:58.7771423Z  echo "Error: Available diskspace is less than $diskspace_cutoff percent. Not enough diskspace." 2024-12-18T03:38:58.7772499Z  echo "$msg" 2024-12-18T03:38:58.7773025Z  exit 1 2024-12-18T03:38:58.7773449Z  else 2024-12-18T03:38:58.7773931Z  difference=$((diskspace - diskspace_new)) 2024-12-18T03:38:58.7774753Z  echo "Diskspace saved: $difference percent" 2024-12-18T03:38:58.7775340Z  fi 2024-12-18T03:38:58.7775709Z fi 2024-12-18T03:38:58.7821602Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:38:58.7822255Z env: 2024-12-18T03:38:58.7822605Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:58.7823124Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:58.7823663Z ##[endgroup] 2024-12-18T03:38:58.7888766Z + diskspace_cutoff=70 2024-12-18T03:38:58.7892626Z ++ docker info -f '{{.DockerRootDir}}' 2024-12-18T03:38:58.8377660Z + docker_root_dir=/home/pytorchci/.local/share/docker 2024-12-18T03:38:58.8384016Z ++ df -H --output=pcent /home/pytorchci/.local/share/docker 2024-12-18T03:38:58.8384320Z ++ sed -n 2p 2024-12-18T03:38:58.8385713Z ++ sed s/%// 2024-12-18T03:38:58.8388553Z ++ sed 's/ //' 2024-12-18T03:38:58.8406801Z + diskspace=55 2024-12-18T03:38:58.8407271Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified' 2024-12-18T03:38:58.8407758Z + [[ 55 -ge 70 ]] 2024-12-18T03:38:58.8457871Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-12-18T03:38:58.8458641Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-12-18T03:38:58.8459262Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-12-18T03:38:58.8501330Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:38:58.8501692Z env: 2024-12-18T03:38:58.8501904Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:58.8502191Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:58.8502507Z ##[endgroup] 2024-12-18T03:38:58.8624401Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2024-12-18T03:38:58.8624939Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 
2024-12-18T03:38:58.8625514Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}" 2024-12-18T03:38:58.8647201Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:38:58.8647483Z env: 2024-12-18T03:38:58.8647669Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:58.8647894Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:58.8648160Z ##[endgroup] 2024-12-18T03:38:58.8737861Z ##[group]Run aws-actions/configure-aws-credentials@v4 2024-12-18T03:38:58.8738153Z with: 2024-12-18T03:38:58.8738424Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2024-12-18T03:38:58.8738749Z aws-region: us-east-1 2024-12-18T03:38:58.8738942Z role-duration-seconds: 18000 2024-12-18T03:38:58.8739153Z audience: sts.amazonaws.com 2024-12-18T03:38:58.8739350Z env: 2024-12-18T03:38:58.8739496Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:58.8739710Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:58.8740108Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:38:58.8740466Z ##[endgroup] 2024-12-18T03:38:59.5199350Z Assuming role with OIDC 2024-12-18T03:38:59.8640139Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2024-12-18T03:38:59.9783795Z ##[group]Run aws-actions/amazon-ecr-login@v2 2024-12-18T03:38:59.9784518Z with: 2024-12-18T03:38:59.9784936Z mask-password: true 2024-12-18T03:38:59.9785426Z registry-type: private 2024-12-18T03:38:59.9785912Z skip-logout: false 2024-12-18T03:38:59.9786332Z env: 2024-12-18T03:38:59.9786706Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:38:59.9787264Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:38:59.9788261Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:38:59.9789194Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:38:59.9789710Z AWS_REGION: us-east-1 2024-12-18T03:38:59.9790327Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:38:59.9791101Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:38:59.9801577Z AWS_SESSION_TOKEN: *** 2024-12-18T03:38:59.9802038Z ##[endgroup] 2024-12-18T03:39:00.6531186Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:01.4437116Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@release/2.6 2024-12-18T03:39:01.4437617Z with: 2024-12-18T03:39:01.4438413Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4439611Z docker-build-dir: .ci/docker 2024-12-18T03:39:01.4440026Z working-directory: . 
2024-12-18T03:39:01.4440507Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:01.4441050Z force-push: false 2024-12-18T03:39:01.4441372Z env: 2024-12-18T03:39:01.4441669Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:01.4442094Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:01.4442852Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:01.4443557Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:01.4443947Z AWS_REGION: us-east-1 2024-12-18T03:39:01.4444576Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:01.4445121Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:01.4453285Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:01.4453984Z ##[endgroup] 2024-12-18T03:39:01.4483111Z ##[group]Run set -ex 2024-12-18T03:39:01.4483619Z set -ex 2024-12-18T03:39:01.4483990Z  2024-12-18T03:39:01.4484636Z # If the docker build directory or the build script doesn't exist, the action will 2024-12-18T03:39:01.4485596Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2024-12-18T03:39:01.4486373Z # job could then download the pre-built image as usual 2024-12-18T03:39:01.4487063Z if [[ ! -d "${DOCKER_BUILD_DIR}" ]] || [[ ! -f "${DOCKER_BUILD_DIR}/build.sh" ]]; then 2024-12-18T03:39:01.4487693Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4488296Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4488836Z  2024-12-18T03:39:01.4489323Z  echo "There is no Docker build script in ${REPO_NAME} repo, skipping..." 2024-12-18T03:39:01.4489929Z  exit 0 2024-12-18T03:39:01.4490249Z else 2024-12-18T03:39:01.4490620Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4491060Z fi 2024-12-18T03:39:01.4491356Z  2024-12-18T03:39:01.4491826Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2024-12-18T03:39:01.4492620Z  # The docker image name already includes the ECR prefix and tag, so we can just 2024-12-18T03:39:01.4493322Z  # use it as it is, but first let's extract the tag 2024-12-18T03:39:01.4493966Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2024-12-18T03:39:01.4494762Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4495407Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4495945Z else 2024-12-18T03:39:01.4496365Z  DOCKER_TAG=$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2024-12-18T03:39:01.4496977Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4497799Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4498527Z fi 2024-12-18T03:39:01.4543878Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:39:01.4544513Z env: 2024-12-18T03:39:01.4544886Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:01.4545397Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:01.4546289Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:01.4547132Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:01.4547454Z AWS_REGION: us-east-1 2024-12-18T03:39:01.4547830Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:01.4548419Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:01.4554051Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:01.4554508Z REPO_NAME: pytorch 2024-12-18T03:39:01.4555588Z DOCKER_IMAGE_NAME: 
308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4556586Z DOCKER_BUILD_DIR: .ci/docker 2024-12-18T03:39:01.4557088Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:01.4557634Z ##[endgroup] 2024-12-18T03:39:01.4626995Z + [[ ! -d .ci/docker ]] 2024-12-18T03:39:01.4627311Z + [[ ! -f .ci/docker/build.sh ]] 2024-12-18T03:39:01.4627592Z + echo skip=false 2024-12-18T03:39:01.4628479Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2024-12-18T03:39:01.4640356Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4641784Z ++ awk -F '[:,]' '{print $2}' 2024-12-18T03:39:01.4664422Z + DOCKER_TAG=45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4664776Z + echo docker-tag=45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4665386Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4708102Z ##[group]Run set +e 2024-12-18T03:39:01.4708398Z set +e 2024-12-18T03:39:01.4708615Z set -x 2024-12-18T03:39:01.4708828Z  2024-12-18T03:39:01.4709034Z login() { 2024-12-18T03:39:01.4709476Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2024-12-18T03:39:01.4709945Z } 2024-12-18T03:39:01.4710148Z  2024-12-18T03:39:01.4710349Z retry () { 2024-12-18T03:39:01.4710615Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2024-12-18T03:39:01.4710917Z } 2024-12-18T03:39:01.4711116Z  2024-12-18T03:39:01.4711343Z retry login "${DOCKER_REGISTRY}" 2024-12-18T03:39:01.4711632Z  2024-12-18T03:39:01.4711843Z START_TIME=$(date +%s) 2024-12-18T03:39:01.4712118Z # Wait up to 90 minutes 2024-12-18T03:39:01.4712457Z while [[ $(( $(date +%s) - 5400 )) -lt $START_TIME ]]; do 2024-12-18T03:39:01.4712961Z  # Check if image already exists, if it does then skip building it 2024-12-18T03:39:01.4713759Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2024-12-18T03:39:01.4714525Z  exit 0 2024-12-18T03:39:01.4715026Z  fi 2024-12-18T03:39:01.4715492Z  2024-12-18T03:39:01.4716203Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2024-12-18T03:39:01.4717358Z  # use this to differentiate between the Docker build and regular build jobs. For the 2024-12-18T03:39:01.4718501Z  # latter, it will wait for the Docker images to become available before continuing 2024-12-18T03:39:01.4719419Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2024-12-18T03:39:01.4720138Z  # It's a Docker build job, let's build the image 2024-12-18T03:39:01.4720747Z  break 2024-12-18T03:39:01.4721184Z  else 2024-12-18T03:39:01.4721800Z  # It's a regular build job, wait for the image to become available 2024-12-18T03:39:01.4722536Z  sleep 300 2024-12-18T03:39:01.4722991Z  fi 2024-12-18T03:39:01.4723391Z done 2024-12-18T03:39:01.4723782Z  2024-12-18T03:39:01.4724418Z # NB: This part requires a full checkout. Otherwise, the merge base will 2024-12-18T03:39:01.4725419Z # be empty. 
The default action would be to continue rebuild the image 2024-12-18T03:39:01.4726767Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2024-12-18T03:39:01.4727592Z  # if we're on the base branch then use the parent commit 2024-12-18T03:39:01.4728308Z  MERGE_BASE=$(git rev-parse HEAD~) 2024-12-18T03:39:01.4728884Z else 2024-12-18T03:39:01.4729470Z  # otherwise we're on a PR, so use the most recent base commit 2024-12-18T03:39:01.4730304Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2024-12-18T03:39:01.4730947Z fi 2024-12-18T03:39:01.4731335Z  2024-12-18T03:39:01.4731754Z if [[ -z "${MERGE_BASE}" ]]; then 2024-12-18T03:39:01.4732392Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4732983Z  2024-12-18T03:39:01.4733794Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2024-12-18T03:39:01.4734967Z  exit 0 2024-12-18T03:39:01.4735423Z fi 2024-12-18T03:39:01.4735802Z  2024-12-18T03:39:01.4736361Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2024-12-18T03:39:01.4737572Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2024-12-18T03:39:01.4739101Z  exit 1 2024-12-18T03:39:01.4739518Z fi 2024-12-18T03:39:01.4739904Z  2024-12-18T03:39:01.4740556Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2024-12-18T03:39:01.4741786Z # If no image exists but the hash is the same as the previous hash then we should error out here 2024-12-18T03:39:01.4743000Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2024-12-18T03:39:01.4744397Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2024-12-18T03:39:01.4745949Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2024-12-18T03:39:01.4746885Z fi 2024-12-18T03:39:01.4747168Z  2024-12-18T03:39:01.4747458Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2024-12-18T03:39:01.4796574Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:39:01.4797261Z env: 2024-12-18T03:39:01.4797670Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:01.4798240Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:01.4799265Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:01.4800223Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:01.4800753Z AWS_REGION: us-east-1 2024-12-18T03:39:01.4801383Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:01.4802126Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:01.4812664Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:01.4813257Z DOCKER_BUILD_DIR: .ci/docker 2024-12-18T03:39:01.4814004Z BASE_REVISION: 0cdf8b1d09254cfda66191d1bd01e3041c3c76f7 2024-12-18T03:39:01.4815807Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4816928Z DOCKER_TAG: 45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:01.4817371Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:01.4817737Z DOCKER_PUSH: 2024-12-18T03:39:01.4817955Z ##[endgroup] 2024-12-18T03:39:01.4880501Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:01.4880931Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:01.4887151Z + aws ecr get-login-password --region us-east-1 2024-12-18T03:39:01.4888228Z + docker login -u AWS --password-stdin 
308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:02.9315587Z WARNING! Your password will be stored unencrypted in /home/pytorchci/.docker/config.json. 2024-12-18T03:39:02.9316714Z Configure a credential helper to remove this warning. See 2024-12-18T03:39:02.9317757Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-12-18T03:39:02.9318994Z 2024-12-18T03:39:02.9320055Z Login Succeeded 2024-12-18T03:39:02.9365662Z ++ date +%s 2024-12-18T03:39:02.9378881Z + START_TIME=1734493142 2024-12-18T03:39:02.9383547Z ++ date +%s 2024-12-18T03:39:02.9397710Z + [[ 1734487742 -lt 1734493142 ]] 2024-12-18T03:39:02.9398458Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:04.2334264Z { 2024-12-18T03:39:04.2335058Z "schemaVersion": 2, 2024-12-18T03:39:04.2335950Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2024-12-18T03:39:04.2336897Z "config": { 2024-12-18T03:39:04.2337594Z "mediaType": "application/vnd.docker.container.image.v1+json", 2024-12-18T03:39:04.2338337Z "size": 24144, 2024-12-18T03:39:04.2339082Z "digest": "sha256:07b91116eb1063321da7a63f31f1ef61d34f7dc8b7d56ec47c24f12af9f38edb" 2024-12-18T03:39:04.2339986Z }, 2024-12-18T03:39:04.2340355Z "layers": [ 2024-12-18T03:39:04.2340762Z { 2024-12-18T03:39:04.2341360Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2342857Z "size": 28583948, 2024-12-18T03:39:04.2343622Z "digest": "sha256:86e5016c269355b382c9cabab4f6646d56d75914f20d545289970436dae431b1" 2024-12-18T03:39:04.2344436Z }, 2024-12-18T03:39:04.2344790Z { 2024-12-18T03:39:04.2345386Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2346157Z "size": 1825, 2024-12-18T03:39:04.2346923Z "digest": "sha256:7e5cdcc8d39cbfc3cbf5fe048da71a36c2c80d4a1e184cccc8c9f02e7e076f64" 2024-12-18T03:39:04.2347764Z }, 2024-12-18T03:39:04.2348114Z { 2024-12-18T03:39:04.2348691Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2349425Z "size": 312642362, 2024-12-18T03:39:04.2350165Z "digest": "sha256:231ab5ed5e979cc01fda602a87a43fd305466b7cc01b98a76690ffa0b716f312" 2024-12-18T03:39:04.2350974Z }, 2024-12-18T03:39:04.2351320Z { 2024-12-18T03:39:04.2351921Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2352766Z "size": 863, 2024-12-18T03:39:04.2353644Z "digest": "sha256:fcbefe9ad79ece7f701532eae83e9de139fe1c0949a7813e55dd3a53003af9cd" 2024-12-18T03:39:04.2354619Z }, 2024-12-18T03:39:04.2355022Z { 2024-12-18T03:39:04.2355686Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2356529Z "size": 106, 2024-12-18T03:39:04.2357399Z "digest": "sha256:af67ad6d287ebd8cbde55a749a5bbc06170aea40e4902c93cd34922e10c8c408" 2024-12-18T03:39:04.2358439Z }, 2024-12-18T03:39:04.2358779Z { 2024-12-18T03:39:04.2359340Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2360048Z "size": 704, 2024-12-18T03:39:04.2360765Z "digest": "sha256:a87dd7b9e020ba8d0d077d74c105bbb7fe8cdd8f1e6ae0c98b84c3e2c06ace60" 2024-12-18T03:39:04.2361584Z }, 2024-12-18T03:39:04.2361928Z { 2024-12-18T03:39:04.2362491Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2363229Z "size": 1257, 2024-12-18T03:39:04.2364088Z "digest": "sha256:aa67f5368a1ddc10ff5b57e69f4ebcf2f83e0f32dfe4648e2a21aee041f107f8" 2024-12-18T03:39:04.2365059Z }, 
2024-12-18T03:39:04.2365465Z { 2024-12-18T03:39:04.2366107Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2366814Z "size": 3708, 2024-12-18T03:39:04.2367523Z "digest": "sha256:e71aede6d46007bf2a44f7cd14800cf5486c895c0d90a0b18ea9c30e6fff4571" 2024-12-18T03:39:04.2368337Z }, 2024-12-18T03:39:04.2368681Z { 2024-12-18T03:39:04.2369243Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2369955Z "size": 1860, 2024-12-18T03:39:04.2370666Z "digest": "sha256:ee65ff0efe5e4f68873b0389e60a87f13f1a1bb0df1178fd70e5e83e59753fab" 2024-12-18T03:39:04.2371477Z }, 2024-12-18T03:39:04.2371815Z { 2024-12-18T03:39:04.2372375Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2373403Z "size": 701, 2024-12-18T03:39:04.2374121Z "digest": "sha256:82c83b062d947fa6269438bde3cb9af1fdf56b6945be283407336d4cc3f16676" 2024-12-18T03:39:04.2375075Z }, 2024-12-18T03:39:04.2375435Z { 2024-12-18T03:39:04.2376003Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2376720Z "size": 2717169179, 2024-12-18T03:39:04.2377443Z "digest": "sha256:8362d69f07765b32e976c14dd6625372f9d81f263db1dc4ef06f46d99571f992" 2024-12-18T03:39:04.2378229Z }, 2024-12-18T03:39:04.2378565Z { 2024-12-18T03:39:04.2379120Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2379833Z "size": 380, 2024-12-18T03:39:04.2380519Z "digest": "sha256:0d29ca60b09574f6c0756a7025b88000a59b1009992fa1478637f5bdc5784ad2" 2024-12-18T03:39:04.2381301Z }, 2024-12-18T03:39:04.2381665Z { 2024-12-18T03:39:04.2382326Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2383170Z "size": 12135, 2024-12-18T03:39:04.2383997Z "digest": "sha256:6697412e0463180656f0051e70d45098fb97bbdb91515623a3ca0d0785ed26c3" 2024-12-18T03:39:04.2385293Z }, 2024-12-18T03:39:04.2385704Z { 2024-12-18T03:39:04.2386364Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2387212Z "size": 504, 2024-12-18T03:39:04.2388047Z "digest": "sha256:3fe85e92871bbecf1a51782c8349d745adf952dabe41e4023f16496e13c89d6e" 2024-12-18T03:39:04.2389249Z }, 2024-12-18T03:39:04.2389594Z { 2024-12-18T03:39:04.2390172Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2390940Z "size": 121477504, 2024-12-18T03:39:04.2391663Z "digest": "sha256:db8079b9a247c260d493898268f96390bcef45198f5799ea28a82894417d3409" 2024-12-18T03:39:04.2392456Z }, 2024-12-18T03:39:04.2392802Z { 2024-12-18T03:39:04.2393360Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2394074Z "size": 109, 2024-12-18T03:39:04.2394796Z "digest": "sha256:1e1cc236bfa694641ad0d2d7fb1a76caa46f0b5364084a30d576a593cdf09838" 2024-12-18T03:39:04.2395605Z }, 2024-12-18T03:39:04.2395950Z { 2024-12-18T03:39:04.2396518Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2397228Z "size": 489, 2024-12-18T03:39:04.2397933Z "digest": "sha256:5594c816d49f271fafd93fedc956a82586a1ddb33f6a81c6c47324b2bc675e6f" 2024-12-18T03:39:04.2398744Z }, 2024-12-18T03:39:04.2399089Z { 2024-12-18T03:39:04.2399651Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2400366Z "size": 296, 2024-12-18T03:39:04.2401042Z "digest": "sha256:d0419af25312d587b2c3532b144859e2155c14d2ca73864f20041dd565b0c9c7" 2024-12-18T03:39:04.2401835Z }, 2024-12-18T03:39:04.2402175Z { 2024-12-18T03:39:04.2402736Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2403445Z "size": 103, 2024-12-18T03:39:04.2404157Z "digest": "sha256:4a7e3c6b1985b14cd275d8480a86dabccb21244b38cca3771881b2f87a880a7c" 2024-12-18T03:39:04.2404965Z }, 2024-12-18T03:39:04.2405362Z { 2024-12-18T03:39:04.2405934Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2406638Z "size": 1474, 2024-12-18T03:39:04.2407342Z "digest": "sha256:e7e49899a1507f1355d8dd6a16d5beb03f1ced22428dc3b796a8c453fba8b41d" 2024-12-18T03:39:04.2408147Z }, 2024-12-18T03:39:04.2408490Z { 2024-12-18T03:39:04.2409045Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2409758Z "size": 430331313, 2024-12-18T03:39:04.2410474Z "digest": "sha256:907b5d385c1450d33d646dcce152dec74855431d5aa6d427f39d74f87ab7a183" 2024-12-18T03:39:04.2411279Z }, 2024-12-18T03:39:04.2411621Z { 2024-12-18T03:39:04.2412174Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2412879Z "size": 163, 2024-12-18T03:39:04.2413587Z "digest": "sha256:e09946f27e30eab5749280587f84a64accfe85ccfaabf422c1fae010dd492cbd" 2024-12-18T03:39:04.2414834Z }, 2024-12-18T03:39:04.2415197Z { 2024-12-18T03:39:04.2415774Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2416542Z "size": 1640, 2024-12-18T03:39:04.2417260Z "digest": "sha256:7294a9f24454fce9dc046e77ab28a73149429dde57793009bb3223eb424713de" 2024-12-18T03:39:04.2418065Z }, 2024-12-18T03:39:04.2418403Z { 2024-12-18T03:39:04.2418968Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2419689Z "size": 7880578701, 2024-12-18T03:39:04.2420438Z "digest": "sha256:5cbdd1d7b36d7b6e2d6428e2857ec4f8614e3becbdb0ec6e3b64f3cf63584e59" 2024-12-18T03:39:04.2421263Z }, 2024-12-18T03:39:04.2421612Z { 2024-12-18T03:39:04.2422175Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2422888Z "size": 105, 2024-12-18T03:39:04.2423580Z "digest": "sha256:f528471f2fcaa24ae54a666344201877b06d22561708fd21987ac10c0873d6a0" 2024-12-18T03:39:04.2424372Z }, 2024-12-18T03:39:04.2424721Z { 2024-12-18T03:39:04.2425280Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2426315Z "size": 1116, 2024-12-18T03:39:04.2427007Z "digest": "sha256:e1276511038b9b5b700d3a69400a0692e260be884871347905717fbec3009ced" 2024-12-18T03:39:04.2427786Z }, 2024-12-18T03:39:04.2428129Z { 2024-12-18T03:39:04.2428685Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2429405Z "size": 318568913, 2024-12-18T03:39:04.2430135Z "digest": "sha256:f3d817e0be62e96a38e3a6596705a35dfa864a2658a4d4e92890f192eaa1103e" 2024-12-18T03:39:04.2430937Z }, 2024-12-18T03:39:04.2431279Z { 2024-12-18T03:39:04.2431833Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2432549Z "size": 111, 2024-12-18T03:39:04.2433262Z "digest": "sha256:babfeddb429861eef35685a83ff84f1b120f8eda1b6a1c552453065dc0bd6cb3" 2024-12-18T03:39:04.2434083Z }, 2024-12-18T03:39:04.2434424Z { 2024-12-18T03:39:04.2434989Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2435701Z "size": 1555, 2024-12-18T03:39:04.2436433Z "digest": "sha256:e00cbec1ecdd51cc9b1b2471a6bf61f3e93cc03540114c13375e35cb48169f90" 2024-12-18T03:39:04.2437251Z }, 2024-12-18T03:39:04.2437591Z { 2024-12-18T03:39:04.2438145Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2438862Z 
"size": 107, 2024-12-18T03:39:04.2439564Z "digest": "sha256:60175e320c6277d6be778429e32b3dd271c508af81bb52bdfec8c5c6edb55a66" 2024-12-18T03:39:04.2440362Z }, 2024-12-18T03:39:04.2440697Z { 2024-12-18T03:39:04.2441248Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2441999Z "size": 166, 2024-12-18T03:39:04.2442844Z "digest": "sha256:e4fd8cd3183ce9d5e635650cd867e61f68b6edc31b37e605f9c038a2064e42c0" 2024-12-18T03:39:04.2443802Z }, 2024-12-18T03:39:04.2444202Z { 2024-12-18T03:39:04.2444849Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2445558Z "size": 2842293, 2024-12-18T03:39:04.2446283Z "digest": "sha256:f79cba221c4638d554fcf21bde3a813a5ddd7d4900f5928e43c92c9307e9d58d" 2024-12-18T03:39:04.2447095Z }, 2024-12-18T03:39:04.2447435Z { 2024-12-18T03:39:04.2447987Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2448700Z "size": 107, 2024-12-18T03:39:04.2449402Z "digest": "sha256:e425c457ab3e0a6637feed1c3ef95776f4c43fc59336a9da4e0a7f19858529c2" 2024-12-18T03:39:04.2450211Z }, 2024-12-18T03:39:04.2450552Z { 2024-12-18T03:39:04.2451109Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2451816Z "size": 566, 2024-12-18T03:39:04.2452525Z "digest": "sha256:6d2cbef9a0f53dff8893f40509e211ffe284a5d8fb4e03923b9c9646b06ad2e7" 2024-12-18T03:39:04.2453337Z }, 2024-12-18T03:39:04.2453676Z { 2024-12-18T03:39:04.2454232Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2455552Z "size": 43195564, 2024-12-18T03:39:04.2456306Z "digest": "sha256:2c4472ad27f9258eee43c37d8fdd0ec0db693fc1d33d03902c90d85850a2fa74" 2024-12-18T03:39:04.2457138Z }, 2024-12-18T03:39:04.2457476Z { 2024-12-18T03:39:04.2458029Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2458737Z "size": 106, 2024-12-18T03:39:04.2459444Z "digest": "sha256:48ffbd6d25729ef5a8cbfc2950686efcd5cc5854803572b079a40e7d59b81481" 2024-12-18T03:39:04.2460258Z }, 2024-12-18T03:39:04.2460604Z { 2024-12-18T03:39:04.2461163Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2461939Z "size": 295, 2024-12-18T03:39:04.2462781Z "digest": "sha256:cf56c8fabbc0842e7f589f295894c2b6eb980e7f5ec8f2232b8c105f503a46e5" 2024-12-18T03:39:04.2463747Z }, 2024-12-18T03:39:04.2464149Z { 2024-12-18T03:39:04.2464810Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2465662Z "size": 88293, 2024-12-18T03:39:04.2466517Z "digest": "sha256:b1c0b18891861b71d1c07dfb5e1063225b3ee635b5cf8d088bc62755ae2a3140" 2024-12-18T03:39:04.2467839Z }, 2024-12-18T03:39:04.2468239Z { 2024-12-18T03:39:04.2468796Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2469508Z "size": 106, 2024-12-18T03:39:04.2470217Z "digest": "sha256:3021fd4a208aeeab27bda896abedaba113f2a3845b3b037c57f855a429bcd61a" 2024-12-18T03:39:04.2471036Z }, 2024-12-18T03:39:04.2471371Z { 2024-12-18T03:39:04.2471928Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2472671Z "size": 1402, 2024-12-18T03:39:04.2473414Z "digest": "sha256:a6f5eec4a2df539ca39d87dfd6ba37ef1c60304eda99ed48b657823b8b5a83aa" 2024-12-18T03:39:04.2474406Z }, 2024-12-18T03:39:04.2474823Z { 2024-12-18T03:39:04.2475507Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2476363Z "size": 701, 2024-12-18T03:39:04.2477086Z "digest": 
"sha256:82c83b062d947fa6269438bde3cb9af1fdf56b6945be283407336d4cc3f16676" 2024-12-18T03:39:04.2477897Z }, 2024-12-18T03:39:04.2478244Z { 2024-12-18T03:39:04.2478813Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2479534Z "size": 137, 2024-12-18T03:39:04.2480250Z "digest": "sha256:89e8bdba21ffada6ab13d8b625b5349b0cc0398741ffd1c5155fdea44633255a" 2024-12-18T03:39:04.2481071Z }, 2024-12-18T03:39:04.2481414Z { 2024-12-18T03:39:04.2481973Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2482691Z "size": 120, 2024-12-18T03:39:04.2483390Z "digest": "sha256:7239a56eb3344ab4d10ba3a41fc1e7af5559c747ee876bb8660f0237ed85a2f0" 2024-12-18T03:39:04.2484195Z }, 2024-12-18T03:39:04.2484539Z { 2024-12-18T03:39:04.2485104Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2485829Z "size": 5014581981, 2024-12-18T03:39:04.2486567Z "digest": "sha256:22093b6a474afa19ed0f46a0491aa69688d1f0b3954fb7614a8f8e2538f90a43" 2024-12-18T03:39:04.2487368Z }, 2024-12-18T03:39:04.2487709Z { 2024-12-18T03:39:04.2488281Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2489001Z "size": 174, 2024-12-18T03:39:04.2489709Z "digest": "sha256:48aa755f3a17db6703cc7bff22b93fe216ee7f1112fb2e8330c2e1f7737822c2" 2024-12-18T03:39:04.2490517Z }, 2024-12-18T03:39:04.2490859Z { 2024-12-18T03:39:04.2491424Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2492143Z "size": 209, 2024-12-18T03:39:04.2492832Z "digest": "sha256:789ba2fe5a5187199259980262efd9e420b90ff489372e2cd60c98470a3db40d" 2024-12-18T03:39:04.2493630Z }, 2024-12-18T03:39:04.2493970Z { 2024-12-18T03:39:04.2494749Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2495490Z "size": 701, 2024-12-18T03:39:04.2496183Z "digest": "sha256:82c83b062d947fa6269438bde3cb9af1fdf56b6945be283407336d4cc3f16676" 2024-12-18T03:39:04.2497302Z }, 2024-12-18T03:39:04.2497681Z { 2024-12-18T03:39:04.2498236Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2498956Z "size": 633, 2024-12-18T03:39:04.2499638Z "digest": "sha256:e2e30e00626e69012828cfe72a0d31832f6ff31805e36d44356537d5f7fcb9ba" 2024-12-18T03:39:04.2500428Z }, 2024-12-18T03:39:04.2500768Z { 2024-12-18T03:39:04.2501324Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2502126Z "size": 296742339, 2024-12-18T03:39:04.2502976Z "digest": "sha256:c631a2525143e662e72b9b97c2e87b09a43a1038995e76b403dad6ba88966e41" 2024-12-18T03:39:04.2503916Z }, 2024-12-18T03:39:04.2504320Z { 2024-12-18T03:39:04.2504975Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2505817Z "size": 1932, 2024-12-18T03:39:04.2506667Z "digest": "sha256:08d3deeeb0fbaac4992048b95790661de095ddd8925926cfa2f2ec217bf4e0f7" 2024-12-18T03:39:04.2507642Z }, 2024-12-18T03:39:04.2508060Z { 2024-12-18T03:39:04.2508629Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2509673Z "size": 15983403, 2024-12-18T03:39:04.2510396Z "digest": "sha256:f098b0de0dbe9076c39d2aa97b15a08de76d7c33be71cc1069255184e8716618" 2024-12-18T03:39:04.2511209Z }, 2024-12-18T03:39:04.2511548Z { 2024-12-18T03:39:04.2512105Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-12-18T03:39:04.2512820Z "size": 54145664, 2024-12-18T03:39:04.2513564Z "digest": "sha256:4e73d9ed9f9510302a97c5ce44701037bd8453e91d3dbe54738434120cf00300" 
2024-12-18T03:39:04.2514534Z } 2024-12-18T03:39:04.2514936Z ] 2024-12-18T03:39:04.2515345Z } 2024-12-18T03:39:04.2515774Z + exit 0 2024-12-18T03:39:04.2568386Z ##[group]Run set -eux 2024-12-18T03:39:04.2568897Z set -eux 2024-12-18T03:39:04.2570454Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin 2024-12-18T03:39:04.2629429Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:39:04.2630150Z env: 2024-12-18T03:39:04.2630560Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:04.2631133Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:04.2632143Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:04.2633111Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:04.2633739Z AWS_REGION: us-east-1 2024-12-18T03:39:04.2634465Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:04.2635276Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:04.2645750Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:04.2646220Z ##[endgroup] 2024-12-18T03:39:04.2739994Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2024-12-18T03:39:04.2740965Z + jq --raw-output .SecretString 2024-12-18T03:39:04.2742962Z + jq -r .docker_hub_readonly_token 2024-12-18T03:39:04.2746758Z + docker login --username pytorchbot --password-stdin 2024-12-18T03:39:04.9600852Z 2024-12-18T03:39:04.9603716Z An error occurred (AccessDeniedException) when calling the GetSecretValue operation: User: arn:aws:sts::308535385114:assumed-role/gha_workflow_s3_and_ecr_read_only/GitHubActions is not authorized to perform: secretsmanager:GetSecretValue on resource: docker_hub_readonly_token because no identity-based policy allows the secretsmanager:GetSecretValue action 2024-12-18T03:39:05.0136620Z Error: Cannot perform an interactive login from a non TTY device 2024-12-18T03:39:05.0184791Z ##[error]Process completed with exit code 1. 
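
Note on the failed step above: the job's assumed role (gha_workflow_s3_and_ecr_read_only) has no identity-based policy allowing secretsmanager:GetSecretValue, so the Docker Hub token fetch produces no output, and docker login then aborts on an empty stdin with the "non TTY device" error. The step is evidently non-fatal: the job continues, and the next step pulls the image from ECR, which this role can read. Below is a minimal sketch of the non-interactive login pattern the step relies on — the secret id and username are taken from the log above, while the --query form and the explicit error handling are illustrative assumptions, not the workflow's exact step:

#!/usr/bin/env bash
set -euo pipefail

# Fetch the secret first so a permissions failure is reported clearly,
# instead of surfacing later as docker login's empty-stdin / non-TTY error.
secret_json="$(aws secretsmanager get-secret-value \
    --secret-id docker_hub_readonly_token \
    --query 'SecretString' --output text)" || {
  echo "cannot read docker_hub_readonly_token: role lacks secretsmanager:GetSecretValue" >&2
  exit 1
}

# --password-stdin reads the token from stdin rather than prompting,
# which is what makes docker login usable on a CI runner without a TTY.
jq -r '.docker_hub_readonly_token' <<<"$secret_json" \
  | docker login --username pytorchbot --password-stdin
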
2024-12-18T03:39:05.0326683Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@release/2.6 2024-12-18T03:39:05.0327565Z with: 2024-12-18T03:39:05.0328691Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:05.0330126Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:05.0330844Z env: 2024-12-18T03:39:05.0331258Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:05.0331887Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:05.0332912Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:05.0333879Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:05.0334417Z AWS_REGION: us-east-1 2024-12-18T03:39:05.0335245Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:05.0335992Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:05.0347256Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:05.0347634Z ##[endgroup] 2024-12-18T03:39:05.0372886Z ##[group]Run set -x 2024-12-18T03:39:05.0373441Z set -x 2024-12-18T03:39:05.0373873Z set +e 2024-12-18T03:39:05.0374284Z  2024-12-18T03:39:05.0374818Z login() { 2024-12-18T03:39:05.0375711Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2024-12-18T03:39:05.0376630Z } 2024-12-18T03:39:05.0377020Z  2024-12-18T03:39:05.0377410Z retry () { 2024-12-18T03:39:05.0378379Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2024-12-18T03:39:05.0378955Z } 2024-12-18T03:39:05.0379349Z  2024-12-18T03:39:05.0379774Z retry login "${DOCKER_REGISTRY}" 2024-12-18T03:39:05.0380323Z  2024-12-18T03:39:05.0380707Z set -e 2024-12-18T03:39:05.0381330Z # ignore output since only exit code is used for conditional 2024-12-18T03:39:05.0382375Z # only pull docker image if it's not available locally 2024-12-18T03:39:05.0383511Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2024-12-18T03:39:05.0384551Z  retry docker pull "${DOCKER_IMAGE}" 2024-12-18T03:39:05.0385231Z fi 2024-12-18T03:39:05.0435345Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:39:05.0436114Z env: 2024-12-18T03:39:05.0436613Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:05.0437295Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:05.0438416Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:05.0439359Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:05.0439895Z AWS_REGION: us-east-1 2024-12-18T03:39:05.0440491Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:05.0441192Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:05.0451394Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:05.0452593Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:05.0453992Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:05.0454870Z ##[endgroup] 2024-12-18T03:39:05.0522455Z + set +e 2024-12-18T03:39:05.0523067Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:05.0523897Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:05.0528446Z + aws ecr get-login-password --region us-east-1 2024-12-18T03:39:05.0529836Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-12-18T03:39:06.4956760Z WARNING! Your password will be stored unencrypted in /home/pytorchci/.docker/config.json. 
2024-12-18T03:39:06.4957895Z Configure a credential helper to remove this warning. See 2024-12-18T03:39:06.4958927Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-12-18T03:39:06.4959617Z 2024-12-18T03:39:06.4996697Z Login Succeeded 2024-12-18T03:39:06.5038368Z + set -e 2024-12-18T03:39:06.5040369Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:39:06.5407720Z Prepare all required actions 2024-12-18T03:39:06.5408474Z Getting action download info 2024-12-18T03:39:06.7881353Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2024-12-18T03:39:07.4552983Z Download action repository 'actions/download-artifact@v4' (SHA:fa0a91b85d4f404e444e00e005971372dc801d16) 2024-12-18T03:39:08.0562909Z ##[group]Run ./.github/actions/download-build-artifacts 2024-12-18T03:39:08.0563414Z with: 2024-12-18T03:39:08.0563751Z name: linux-focal-rocm6.2-py3.10 2024-12-18T03:39:08.0564173Z s3-bucket: gha-artifacts 2024-12-18T03:39:08.0564528Z env: 2024-12-18T03:39:08.0564832Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:08.0565249Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:08.0566020Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:08.0566731Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:08.0567193Z AWS_REGION: us-east-1 2024-12-18T03:39:08.0567754Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:08.0568286Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:08.0576120Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:08.0576561Z ##[endgroup] 2024-12-18T03:39:08.0621212Z ##[group]Run seemethere/download-artifact-s3@v4 2024-12-18T03:39:08.0621682Z with: 2024-12-18T03:39:08.0622354Z name: linux-focal-rocm6.2-py3.10 2024-12-18T03:39:08.0622782Z s3-bucket: gha-artifacts 2024-12-18T03:39:08.0623164Z region: us-east-1 2024-12-18T03:39:08.0623485Z env: 2024-12-18T03:39:08.0623784Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:39:08.0624205Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:39:08.0624970Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:39:08.0625706Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:39:08.0626106Z AWS_REGION: us-east-1 2024-12-18T03:39:08.0626549Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:39:08.0627080Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:39:08.0635206Z AWS_SESSION_TOKEN: *** 2024-12-18T03:39:08.0635569Z ##[endgroup] 2024-12-18T03:39:08.4681387Z (node:1230102) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2024-12-18T03:39:08.4682152Z 2024-12-18T03:39:08.4682464Z Please migrate your code to use AWS SDK for JavaScript (v3). 
2024-12-18T03:39:08.4683328Z For more information, check the migration guide at https://a.co/7PzMCcy 2024-12-18T03:39:08.4684156Z (Use `node --trace-warnings ...` to show where the warning was created) 2024-12-18T03:39:08.7375198Z Found 1 objects with prefix pytorch/pytorch/12383255654/linux-focal-rocm6.2-py3.10/ 2024-12-18T03:39:08.7376541Z Starting download (1/1): /home/pytorchci/actions-runner/_work/pytorch/pytorch/artifacts.zip 2024-12-18T03:40:02.1877605Z Finished download (1/1): /home/pytorchci/actions-runner/_work/pytorch/pytorch/artifacts.zip 2024-12-18T03:40:02.1896260Z Artifact download has finished successfully 2024-12-18T03:40:02.2428849Z ##[group]Run unzip -o artifacts.zip 2024-12-18T03:40:02.2429196Z unzip -o artifacts.zip 2024-12-18T03:40:02.2474829Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:02.2475388Z env: 2024-12-18T03:40:02.2475711Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:02.2476164Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:02.2476970Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:02.2477708Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:02.2478123Z AWS_REGION: us-east-1 2024-12-18T03:40:02.2478588Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:02.2479145Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:02.2487213Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:02.2487583Z ##[endgroup] 2024-12-18T03:40:02.2585649Z Archive: artifacts.zip 2024-12-18T03:40:02.2586289Z creating: dist/ 2024-12-18T03:40:04.8361414Z inflating: dist/torch-2.6.0a0+git0cdf8b1-cp310-cp310-linux_x86_64.whl 2024-12-18T03:40:04.8484948Z inflating: dist/.ninja_log 2024-12-18T03:40:04.8488148Z creating: build/custom_test_artifacts/ 2024-12-18T03:40:04.8489726Z creating: build/custom_test_artifacts/custom-op-build/ 2024-12-18T03:40:04.8490243Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2024-12-18T03:40:04.8490884Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2024-12-18T03:40:04.8491529Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2024-12-18T03:40:04.8492128Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/ 2024-12-18T03:40:04.8492726Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-12-18T03:40:04.8493360Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-12-18T03:40:04.8493985Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-12-18T03:40:04.8494840Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-12-18T03:40:04.8495559Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-12-18T03:40:04.8496197Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-12-18T03:40:04.8497222Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-12-18T03:40:04.8497965Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-12-18T03:40:04.8498718Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-12-18T03:40:04.8499451Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-12-18T03:40:04.8500233Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-12-18T03:40:04.8500995Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-12-18T03:40:04.8501618Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-12-18T03:40:04.8502126Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2024-12-18T03:40:04.8502575Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2024-12-18T03:40:04.8503035Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2024-12-18T03:40:04.8503513Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2024-12-18T03:40:04.8504047Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2024-12-18T03:40:04.8504648Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2024-12-18T03:40:04.8505229Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2024-12-18T03:40:04.8505768Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2024-12-18T03:40:04.8506319Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2024-12-18T03:40:04.8506875Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2024-12-18T03:40:04.8507427Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2024-12-18T03:40:04.8507979Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2024-12-18T03:40:04.8508524Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2024-12-18T03:40:04.8526682Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2024-12-18T03:40:04.8658614Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2024-12-18T03:40:04.8659331Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2024-12-18T03:40:04.8660067Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2024-12-18T03:40:04.8660846Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2024-12-18T03:40:04.8661466Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2024-12-18T03:40:04.8662039Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2024-12-18T03:40:04.8662630Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2024-12-18T03:40:04.8663219Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2024-12-18T03:40:04.8663815Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2024-12-18T03:40:04.8664396Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2024-12-18T03:40:04.8664985Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2024-12-18T03:40:04.8683026Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2024-12-18T03:40:04.8758514Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2024-12-18T03:40:04.8759351Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-12-18T03:40:04.8760077Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2024-12-18T03:40:04.8760747Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2024-12-18T03:40:04.8761346Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2024-12-18T03:40:04.8761946Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2024-12-18T03:40:04.8762513Z inflating: build/custom_test_artifacts/custom-op-build/hip_new_types.cc 2024-12-18T03:40:04.8763080Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2024-12-18T03:40:04.8763730Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2024-12-18T03:40:04.8764254Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2024-12-18T03:40:04.8872632Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2024-12-18T03:40:04.8929631Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2024-12-18T03:40:04.8930664Z creating: build/custom_test_artifacts/jit-hook-build/ 2024-12-18T03:40:04.8931576Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2024-12-18T03:40:04.8932621Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2024-12-18T03:40:04.8933883Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2024-12-18T03:40:04.8935209Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/ 2024-12-18T03:40:04.8936393Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-12-18T03:40:04.8937647Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-12-18T03:40:04.8938828Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-12-18T03:40:04.8940247Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-12-18T03:40:04.8941672Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-12-18T03:40:04.8943369Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-12-18T03:40:04.8944594Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-12-18T03:40:04.8946019Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-12-18T03:40:04.8947510Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-12-18T03:40:04.8948924Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-12-18T03:40:04.8950333Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-12-18T03:40:04.8951749Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-12-18T03:40:04.8953185Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-12-18T03:40:04.8954405Z creating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2024-12-18T03:40:04.8955455Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2024-12-18T03:40:04.8956558Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2024-12-18T03:40:04.8957721Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2024-12-18T03:40:04.8959417Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2024-12-18T03:40:04.8960900Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2024-12-18T03:40:04.8962327Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2024-12-18T03:40:04.8963653Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2024-12-18T03:40:04.8965027Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2024-12-18T03:40:04.8966422Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2024-12-18T03:40:04.8967823Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2024-12-18T03:40:04.8969212Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2024-12-18T03:40:04.8970583Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2024-12-18T03:40:04.8972060Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2024-12-18T03:40:04.9030246Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2024-12-18T03:40:04.9031869Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-12-18T03:40:04.9033281Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2024-12-18T03:40:04.9034551Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2024-12-18T03:40:04.9035690Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2024-12-18T03:40:04.9036808Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2024-12-18T03:40:04.9037893Z inflating: build/custom_test_artifacts/jit-hook-build/hip_new_types.cc 2024-12-18T03:40:04.9038893Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2024-12-18T03:40:04.9039842Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2024-12-18T03:40:04.9040817Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2024-12-18T03:40:04.9080179Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2024-12-18T03:40:04.9081227Z creating: build/custom_test_artifacts/custom-backend-build/ 2024-12-18T03:40:04.9082670Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2024-12-18T03:40:04.9083878Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2024-12-18T03:40:04.9085241Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2024-12-18T03:40:04.9086501Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/ 2024-12-18T03:40:04.9087734Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeSystem.cmake 
2024-12-18T03:40:04.9089062Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-12-18T03:40:04.9090346Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-12-18T03:40:04.9091820Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-12-18T03:40:04.9093309Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-12-18T03:40:04.9094782Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-12-18T03:40:04.9096103Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-12-18T03:40:04.9097965Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-12-18T03:40:04.9099540Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-12-18T03:40:04.9101046Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-12-18T03:40:04.9102563Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-12-18T03:40:04.9104102Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-12-18T03:40:04.9105663Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-12-18T03:40:04.9106978Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2024-12-18T03:40:04.9108126Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2024-12-18T03:40:04.9109319Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2024-12-18T03:40:04.9110578Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2024-12-18T03:40:04.9111999Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2024-12-18T03:40:04.9113601Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2024-12-18T03:40:04.9115159Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2024-12-18T03:40:04.9116635Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2024-12-18T03:40:04.9118129Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2024-12-18T03:40:04.9119653Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2024-12-18T03:40:04.9121155Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2024-12-18T03:40:04.9122641Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2024-12-18T03:40:04.9124109Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2024-12-18T03:40:04.9125686Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2024-12-18T03:40:04.9218430Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2024-12-18T03:40:04.9219979Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2024-12-18T03:40:04.9221486Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2024-12-18T03:40:04.9223168Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2024-12-18T03:40:04.9224785Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2024-12-18T03:40:04.9226306Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2024-12-18T03:40:04.9227870Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2024-12-18T03:40:04.9229468Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2024-12-18T03:40:04.9231071Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2024-12-18T03:40:04.9232663Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2024-12-18T03:40:04.9234530Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2024-12-18T03:40:04.9242222Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2024-12-18T03:40:04.9292228Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2024-12-18T03:40:04.9293949Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-12-18T03:40:04.9295572Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2024-12-18T03:40:04.9296894Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2024-12-18T03:40:04.9298098Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2024-12-18T03:40:04.9299299Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2024-12-18T03:40:04.9300471Z inflating: build/custom_test_artifacts/custom-backend-build/hip_new_types.cc 2024-12-18T03:40:04.9301587Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2024-12-18T03:40:04.9302630Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2024-12-18T03:40:04.9303689Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2024-12-18T03:40:04.9388181Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2024-12-18T03:40:04.9426415Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2024-12-18T03:40:04.9427283Z creating: build/lib/ 2024-12-18T03:40:04.9505885Z inflating: build/lib/libprotobuf-lite.a 2024-12-18T03:40:04.9514375Z inflating: build/lib/libpthreadpool.a 2024-12-18T03:40:04.9521567Z inflating: build/lib/libcpuinfo.a 2024-12-18T03:40:04.9528526Z inflating: build/lib/libcpuinfo_internals.a 2024-12-18T03:40:04.9937111Z inflating: build/lib/libprotobuf.a 2024-12-18T03:40:05.0387237Z inflating: build/lib/libprotoc.a 2024-12-18T03:40:05.0387934Z inflating: build/lib/libclog.a 2024-12-18T03:40:05.0405116Z inflating: build/lib/libpytorch_qnnpack.a 2024-12-18T03:40:05.0406143Z inflating: build/lib/libnnpack_reference_layers.a 2024-12-18T03:40:05.0423158Z 
inflating: build/lib/libnnpack.a 2024-12-18T03:40:05.0589012Z inflating: build/lib/libmicrokernels-prod.a 2024-12-18T03:40:05.1358940Z inflating: build/lib/libmicrokernels-all.a 2024-12-18T03:40:05.1420290Z inflating: build/lib/libgtest.a 2024-12-18T03:40:05.1436367Z inflating: build/lib/libgmock.a 2024-12-18T03:40:05.1437124Z inflating: build/lib/libgtest_main.a 2024-12-18T03:40:05.1437762Z inflating: build/lib/libgmock_main.a 2024-12-18T03:40:05.1503742Z inflating: build/lib/libbenchmark.a 2024-12-18T03:40:05.1504481Z inflating: build/lib/libbenchmark_main.a 2024-12-18T03:40:05.1583363Z inflating: build/lib/libXNNPACK.a 2024-12-18T03:40:05.1590145Z inflating: build/lib/libittnotify.a 2024-12-18T03:40:05.1646415Z inflating: build/lib/libasmjit.a 2024-12-18T03:40:05.2783504Z inflating: build/lib/libfbgemm.a 2024-12-18T03:40:05.2808056Z inflating: build/lib/libtensorpipe_uv.a 2024-12-18T03:40:05.3325124Z inflating: build/lib/libtensorpipe.a 2024-12-18T03:40:05.3363216Z inflating: build/lib/libonnx_proto.a 2024-12-18T03:40:05.3464699Z inflating: build/lib/libgloo.a 2024-12-18T03:40:05.3816240Z inflating: build/lib/libgloo_hip.a 2024-12-18T03:40:05.4473740Z inflating: build/lib/libonnx.a 2024-12-18T03:40:05.4492057Z inflating: build/lib/libfmt.a 2024-12-18T03:40:05.4782810Z inflating: build/lib/libkineto.a 2024-12-18T03:40:06.4083773Z inflating: build/lib/libdnnl.a 2024-12-18T03:40:06.4171491Z inflating: build/lib/libc10.so 2024-12-18T03:40:06.4172250Z inflating: build/lib/libtorch_global_deps.so 2024-12-18T03:40:06.4215201Z inflating: build/lib/libc10_hip.so 2024-12-18T03:40:06.4215949Z inflating: build/lib/libcaffe2_nvrtc.so 2024-12-18T03:40:08.6616795Z inflating: build/lib/libtorch_cpu.so 2024-12-18T03:40:08.6621067Z inflating: build/lib/libunbox_lib.a 2024-12-18T03:40:08.6624699Z inflating: build/lib/libshm.so 2024-12-18T03:40:09.8385519Z inflating: build/lib/libtorch_hip.so 2024-12-18T03:40:09.8386307Z inflating: build/lib/libtorch.so 2024-12-18T03:40:09.8404790Z inflating: build/lib/libjitbackend_test.so 2024-12-18T03:40:09.8466896Z inflating: build/lib/libtorchbind_test.so 2024-12-18T03:40:09.8488616Z inflating: build/lib/libbackend_with_compiler.so 2024-12-18T03:40:09.8511072Z inflating: build/lib/libaoti_custom_ops.so 2024-12-18T03:40:10.0348297Z inflating: build/lib/libtorch_python.so 2024-12-18T03:40:10.0379011Z inflating: build/lib/libnnapi_backend.so 2024-12-18T03:40:10.0379708Z creating: build/bin/ 2024-12-18T03:40:10.0380216Z creating: build/bin/CMakeFiles/ 2024-12-18T03:40:10.0380841Z inflating: build/bin/cmake_install.cmake 2024-12-18T03:40:10.0381511Z inflating: build/bin/CTestTestfile.cmake 2024-12-18T03:40:10.0775596Z inflating: build/bin/protoc-3.13.0.0 2024-12-18T03:40:10.1168895Z inflating: build/bin/protoc 2024-12-18T03:40:10.1216182Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2024-12-18T03:40:10.1264381Z inflating: build/bin/c10_DeviceGuard_test 2024-12-18T03:40:10.1313026Z inflating: build/bin/c10_Device_test 2024-12-18T03:40:10.1368756Z inflating: build/bin/c10_DispatchKeySet_test 2024-12-18T03:40:10.1419257Z inflating: build/bin/c10_Scalar_test 2024-12-18T03:40:10.1465136Z inflating: build/bin/c10_StreamGuard_test 2024-12-18T03:40:10.1512981Z inflating: build/bin/c10_SymInt_test 2024-12-18T03:40:10.1563558Z inflating: build/bin/c10_InlineDeviceGuard_test 2024-12-18T03:40:10.1615616Z inflating: build/bin/c10_InlineStreamGuard_test 2024-12-18T03:40:10.1668224Z inflating: build/bin/c10_SizesAndStrides_test 2024-12-18T03:40:10.1733492Z inflating: 
build/bin/c10_cow_test 2024-12-18T03:40:10.1779427Z inflating: build/bin/c10_ConstexprCrc_test 2024-12-18T03:40:10.1828930Z inflating: build/bin/c10_Bitset_test 2024-12-18T03:40:10.1875467Z inflating: build/bin/c10_ArrayRef_test 2024-12-18T03:40:10.1928022Z inflating: build/bin/c10_LeftRight_test 2024-12-18T03:40:10.1975276Z inflating: build/bin/c10_DeadlockDetection_test 2024-12-18T03:40:10.2022595Z inflating: build/bin/c10_Half_test 2024-12-18T03:40:10.2074110Z inflating: build/bin/c10_Metaprogramming_test 2024-12-18T03:40:10.2124406Z inflating: build/bin/c10_NetworkFlow_test 2024-12-18T03:40:10.2170618Z inflating: build/bin/c10_Synchronized_test 2024-12-18T03:40:10.2219049Z inflating: build/bin/c10_TypeIndex_test 2024-12-18T03:40:10.2270921Z inflating: build/bin/c10_ThreadLocal_test 2024-12-18T03:40:10.2319035Z inflating: build/bin/c10_TypeList_test 2024-12-18T03:40:10.2364934Z inflating: build/bin/c10_TypeTraits_test 2024-12-18T03:40:10.2413448Z inflating: build/bin/c10_accumulate_test 2024-12-18T03:40:10.2465676Z inflating: build/bin/c10_bfloat16_test 2024-12-18T03:40:10.2512325Z inflating: build/bin/c10_bit_cast_test 2024-12-18T03:40:10.2565289Z inflating: build/bin/c10_complex_test 2024-12-18T03:40:10.2618415Z inflating: build/bin/c10_complex_math_test 2024-12-18T03:40:10.2664980Z inflating: build/bin/c10_error_test 2024-12-18T03:40:10.2714301Z inflating: build/bin/c10_exception_test 2024-12-18T03:40:10.2761711Z inflating: build/bin/c10_flags_test 2024-12-18T03:40:10.2809060Z inflating: build/bin/c10_generic_math_test 2024-12-18T03:40:10.2859039Z inflating: build/bin/c10_lazy_test 2024-12-18T03:40:10.2906760Z inflating: build/bin/c10_irange_test 2024-12-18T03:40:10.3059440Z inflating: build/bin/c10_intrusive_ptr_test 2024-12-18T03:40:10.3112819Z inflating: build/bin/c10_logging_test 2024-12-18T03:40:10.3183426Z inflating: build/bin/c10_optional_test 2024-12-18T03:40:10.3241937Z inflating: build/bin/c10_ordered_preserving_dict_test 2024-12-18T03:40:10.3291866Z inflating: build/bin/c10_registry_test 2024-12-18T03:40:10.3434395Z inflating: build/bin/c10_small_vector_test 2024-12-18T03:40:10.3482932Z inflating: build/bin/c10_string_util_test 2024-12-18T03:40:10.3531462Z inflating: build/bin/c10_ssize_test 2024-12-18T03:40:10.3587312Z inflating: build/bin/c10_string_view_test 2024-12-18T03:40:10.3634702Z inflating: build/bin/c10_tempfile_test 2024-12-18T03:40:10.3677438Z inflating: build/bin/c10_intrusive_ptr_benchmark 2024-12-18T03:40:10.3730392Z inflating: build/bin/c10_typeid_test 2024-12-18T03:40:10.3776397Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2024-12-18T03:40:10.3822213Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2024-12-18T03:40:10.3868139Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2024-12-18T03:40:10.3913810Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2024-12-18T03:40:10.3959716Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2024-12-18T03:40:10.4005403Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2024-12-18T03:40:10.4051168Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2024-12-18T03:40:10.4097253Z inflating: build/bin/c10_hip_HIPTest 2024-12-18T03:40:10.4455482Z inflating: build/bin/vec_test_all_types_DEFAULT 2024-12-18T03:40:10.4822532Z inflating: build/bin/vec_test_all_types_AVX512 2024-12-18T03:40:10.5201509Z inflating: build/bin/vec_test_all_types_AVX2 
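
The three vec_test_all_types_* binaries just unpacked are the same SIMD test suite compiled once per instruction-set target (DEFAULT, AVX2, AVX512); the build/bin listing resumes below. A sketch of picking the widest variant a host supports, with the selection logic assumed rather than taken from this job:

  # Sketch: run the widest SIMD variant the host CPU supports (selection logic assumed)
  if grep -qw avx512f /proc/cpuinfo; then
    build/bin/vec_test_all_types_AVX512
  elif grep -qw avx2 /proc/cpuinfo; then
    build/bin/vec_test_all_types_AVX2
  else
    build/bin/vec_test_all_types_DEFAULT
  fi
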
2024-12-18T03:40:10.5251860Z inflating: build/bin/test_edge_op_registration 2024-12-18T03:40:10.5302116Z inflating: build/bin/BackoffTest 2024-12-18T03:40:10.5351712Z inflating: build/bin/FileStoreTest 2024-12-18T03:40:10.5403452Z inflating: build/bin/TCPStoreTest 2024-12-18T03:40:10.5453190Z inflating: build/bin/HashStoreTest 2024-12-18T03:40:10.5514313Z inflating: build/bin/ProcessGroupGlooTest 2024-12-18T03:40:10.5515941Z inflating: build/bin/example_allreduce 2024-12-18T03:40:10.5569135Z inflating: build/bin/static_runtime_bench 2024-12-18T03:40:10.5798535Z inflating: build/bin/static_runtime_test 2024-12-18T03:40:10.5868611Z inflating: build/bin/Dict_test 2024-12-18T03:40:10.5914445Z inflating: build/bin/hip_half_test 2024-12-18T03:40:10.5963150Z inflating: build/bin/Dimname_test 2024-12-18T03:40:10.6023599Z inflating: build/bin/MaybeOwned_test 2024-12-18T03:40:10.6078105Z inflating: build/bin/NamedTensor_test 2024-12-18T03:40:10.6132451Z inflating: build/bin/apply_utils_test 2024-12-18T03:40:10.6187881Z inflating: build/bin/atest 2024-12-18T03:40:10.6246482Z inflating: build/bin/basic 2024-12-18T03:40:10.6297919Z inflating: build/bin/broadcast_test 2024-12-18T03:40:10.6345790Z inflating: build/bin/cpu_allocator_test 2024-12-18T03:40:10.6395869Z inflating: build/bin/cpu_profiling_allocator_test 2024-12-18T03:40:10.6451350Z inflating: build/bin/cpu_generator_test 2024-12-18T03:40:10.6536521Z inflating: build/bin/cpu_rng_test 2024-12-18T03:40:10.6583614Z inflating: build/bin/dispatch_key_set_test 2024-12-18T03:40:10.6631164Z inflating: build/bin/dlconvertor_test 2024-12-18T03:40:10.6685413Z inflating: build/bin/extension_backend_test 2024-12-18T03:40:10.6737150Z inflating: build/bin/half_test 2024-12-18T03:40:10.6825217Z inflating: build/bin/ivalue_test 2024-12-18T03:40:10.6871945Z inflating: build/bin/lazy_tensor_test 2024-12-18T03:40:10.6922401Z inflating: build/bin/math_kernel_test 2024-12-18T03:40:10.6970081Z inflating: build/bin/operator_name_test 2024-12-18T03:40:10.7020664Z inflating: build/bin/memory_format_test 2024-12-18T03:40:10.7070737Z inflating: build/bin/memory_overlapping_test 2024-12-18T03:40:10.7123258Z inflating: build/bin/native_test 2024-12-18T03:40:10.7173061Z inflating: build/bin/mobile_memory_cleanup 2024-12-18T03:40:10.7221264Z inflating: build/bin/operators_test 2024-12-18T03:40:10.7270076Z inflating: build/bin/packedtensoraccessor_test 2024-12-18T03:40:10.7332217Z inflating: build/bin/pow_test 2024-12-18T03:40:10.7379224Z inflating: build/bin/reduce_ops_test 2024-12-18T03:40:10.7432918Z inflating: build/bin/quantized_test 2024-12-18T03:40:10.7480877Z inflating: build/bin/reportMemoryUsage_test 2024-12-18T03:40:10.7534183Z inflating: build/bin/scalar_tensor_test 2024-12-18T03:40:10.7582710Z inflating: build/bin/StorageUtils_test 2024-12-18T03:40:10.7637549Z inflating: build/bin/scalar_test 2024-12-18T03:40:10.7687096Z inflating: build/bin/stride_properties_test 2024-12-18T03:40:10.7760939Z inflating: build/bin/tensor_iterator_test 2024-12-18T03:40:10.7811857Z inflating: build/bin/test_parallel 2024-12-18T03:40:10.7863764Z inflating: build/bin/type_ptr_test 2024-12-18T03:40:10.7865427Z inflating: build/bin/thread_init_test 2024-12-18T03:40:10.7866925Z inflating: build/bin/verify_api_visibility 2024-12-18T03:40:10.7916962Z inflating: build/bin/undefined_tensor_test 2024-12-18T03:40:10.7972561Z inflating: build/bin/type_test 2024-12-18T03:40:10.8037647Z inflating: build/bin/legacy_vmap_test 2024-12-18T03:40:10.8085884Z inflating: build/bin/weakref_test 
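
The executables landing in build/bin (this listing also continues below) are standalone GoogleTest suites, so once extraction finishes any of them accepts the standard gtest flags; for example, with the c10_Half_test binary unpacked above (the filter pattern is illustrative only):

  # Enumerate the cases in a suite, then run a filtered subset
  build/bin/c10_Half_test --gtest_list_tests
  build/bin/c10_Half_test --gtest_filter='*Half*'   # illustrative pattern, not from this log
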
2024-12-18T03:40:10.8184934Z inflating: build/bin/List_test 2024-12-18T03:40:10.8233486Z inflating: build/bin/wrapdim_test 2024-12-18T03:40:10.8289235Z inflating: build/bin/IListRef_test 2024-12-18T03:40:10.8337888Z inflating: build/bin/xla_tensor_test 2024-12-18T03:40:10.8449105Z inflating: build/bin/kernel_function_legacy_test 2024-12-18T03:40:10.8537929Z inflating: build/bin/kernel_function_test 2024-12-18T03:40:10.8655317Z inflating: build/bin/kernel_lambda_legacy_test 2024-12-18T03:40:10.8750489Z inflating: build/bin/kernel_lambda_test 2024-12-18T03:40:10.8812858Z inflating: build/bin/KernelFunction_test 2024-12-18T03:40:10.8869992Z inflating: build/bin/kernel_stackbased_test 2024-12-18T03:40:10.8918090Z inflating: build/bin/CppSignature_test 2024-12-18T03:40:10.9006538Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2024-12-18T03:40:10.9052294Z inflating: build/bin/op_allowlist_test 2024-12-18T03:40:10.9111752Z inflating: build/bin/inline_container_test 2024-12-18T03:40:10.9386678Z inflating: build/bin/op_registration_test 2024-12-18T03:40:10.9438728Z inflating: build/bin/backend_fallback_test 2024-12-18T03:40:10.9484567Z inflating: build/bin/hip_complex_math_test 2024-12-18T03:40:10.9534795Z inflating: build/bin/hip_apply_test 2024-12-18T03:40:10.9580164Z inflating: build/bin/hip_complex_test 2024-12-18T03:40:10.9626052Z inflating: build/bin/hip_distributions_test 2024-12-18T03:40:10.9671846Z inflating: build/bin/hip_generator_test 2024-12-18T03:40:10.9720071Z inflating: build/bin/hip_dlconvertor_test 2024-12-18T03:40:10.9765934Z inflating: build/bin/hip_integer_divider_test 2024-12-18T03:40:10.9812379Z inflating: build/bin/hip_optional_test 2024-12-18T03:40:10.9858364Z inflating: build/bin/hip_packedtensoraccessor_test 2024-12-18T03:40:10.9904092Z inflating: build/bin/hip_vectorized_test 2024-12-18T03:40:11.0421439Z inflating: build/bin/test_jit 2024-12-18T03:40:11.1134842Z inflating: build/bin/test_tensorexpr 2024-12-18T03:40:11.1147621Z inflating: build/bin/tutorial_tensorexpr 2024-12-18T03:40:11.1199032Z inflating: build/bin/test_dist_autograd 2024-12-18T03:40:11.1262912Z inflating: build/bin/test_cpp_rpc 2024-12-18T03:40:11.1264441Z inflating: build/bin/parallel_benchmark 2024-12-18T03:40:11.2317113Z inflating: build/bin/test_api 2024-12-18T03:40:11.2325139Z inflating: build/bin/aot_model_compiler_test 2024-12-18T03:40:11.2387219Z inflating: build/bin/test_mobile_nnc 2024-12-18T03:40:11.2703606Z inflating: build/bin/test_lazy 2024-12-18T03:40:11.2706850Z inflating: build/bin/torch_shm_manager 2024-12-18T03:40:11.2707523Z creating: .additional_ci_files/ 2024-12-18T03:40:11.2787466Z inflating: .additional_ci_files/test-times.json 2024-12-18T03:40:11.3103512Z inflating: .additional_ci_files/test-class-times.json 2024-12-18T03:40:11.3147549Z ##[group]Run rm artifacts.zip 2024-12-18T03:40:11.3147861Z rm artifacts.zip 2024-12-18T03:40:11.3180406Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:11.3180784Z env: 2024-12-18T03:40:11.3180999Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:11.3181320Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:11.3181837Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:11.3182317Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:11.3182586Z AWS_REGION: us-east-1 2024-12-18T03:40:11.3182899Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:11.3183265Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:11.3188527Z AWS_SESSION_TOKEN: *** 
2024-12-18T03:40:11.3188771Z ##[endgroup]
2024-12-18T03:40:11.4727050Z ##[group]Run df -H
2024-12-18T03:40:11.4727290Z df -H
2024-12-18T03:40:11.4752652Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T03:40:11.4753013Z env:
2024-12-18T03:40:11.4753226Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:40:11.4753519Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:40:11.4754033Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T03:40:11.4754516Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T03:40:11.4754795Z   AWS_REGION: us-east-1
2024-12-18T03:40:11.4755091Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T03:40:11.4755453Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T03:40:11.4760694Z   AWS_SESSION_TOKEN: ***
2024-12-18T03:40:11.4760940Z ##[endgroup]
2024-12-18T03:40:11.4826115Z Filesystem                         Size  Used Avail Use% Mounted on
2024-12-18T03:40:11.4826499Z tmpfs                               14G   18M   14G   1% /run
2024-12-18T03:40:11.4826826Z /dev/mapper/ubuntu--vg-ubuntu--lv  1.9T  972G  819G  55% /
2024-12-18T03:40:11.4827157Z tmpfs                               68G  8.2k   68G   1% /dev/shm
2024-12-18T03:40:11.4827461Z tmpfs                              5.3M     0  5.3M   0% /run/lock
2024-12-18T03:40:11.4827763Z /dev/sda2                          2.1G  324M  1.6G  17% /boot
2024-12-18T03:40:11.4828055Z /dev/sda1                          1.2G  6.4M  1.2G   1% /boot/efi
2024-12-18T03:40:11.4828353Z tmpfs                               14G   17k   14G   1% /run/user/1001
2024-12-18T03:40:11.4872363Z Prepare all required actions
2024-12-18T03:40:11.4873215Z Getting action download info
2024-12-18T03:40:11.7302080Z ##[group]Run ./.github/actions/download-td-artifacts
2024-12-18T03:40:11.7302727Z with:
2024-12-18T03:40:11.7303102Z env:
2024-12-18T03:40:11.7303502Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:40:11.7304082Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:40:11.7305115Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T03:40:11.7306055Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T03:40:11.7306578Z   AWS_REGION: us-east-1
2024-12-18T03:40:11.7307151Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T03:40:11.7307850Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T03:40:11.7318081Z   AWS_SESSION_TOKEN: ***
2024-12-18T03:40:11.7318557Z ##[endgroup]
2024-12-18T03:40:11.7370521Z ##[group]Run seemethere/download-artifact-s3@v4
2024-12-18T03:40:11.7371133Z with:
2024-12-18T03:40:11.7371525Z   name: td_results
2024-12-18T03:40:11.7372031Z   s3-bucket: gha-artifacts
2024-12-18T03:40:11.7372522Z   region: us-east-1
2024-12-18T03:40:11.7372932Z env:
2024-12-18T03:40:11.7373317Z   GIT_DEFAULT_BRANCH: main
2024-12-18T03:40:11.7373862Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T03:40:11.7375020Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T03:40:11.7376421Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T03:40:11.7376938Z   AWS_REGION: us-east-1
2024-12-18T03:40:11.7377494Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T03:40:11.7378187Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T03:40:11.7388473Z   AWS_SESSION_TOKEN: ***
2024-12-18T03:40:11.7388945Z ##[endgroup]
2024-12-18T03:40:12.1393942Z (node:1230242) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023.
2024-12-18T03:40:12.1394839Z 
2024-12-18T03:40:12.1395205Z Please migrate your code to use AWS SDK for JavaScript (v3).
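
The Node deprecation notice above comes from the seemethere/download-artifact-s3 action just configured; its migration-guide link follows below. In CLI terms the job's artifact plumbing is one S3 pattern: the build products arrived earlier as artifacts.zip (the long inflating/extracting listing above, deleted after unpacking) and per-run files such as td_results are fetched by key prefix. A rough manual equivalent, assuming the gha-artifacts bucket shown above and the pytorch/pytorch/<run-id>/ key layout this log prints; the exact artifacts.zip key is an assumption:

  # Sketch: fetch and unpack job artifacts the way this workflow does (key layout assumed)
  RUN_ID=12383255654
  aws s3 cp "s3://gha-artifacts/pytorch/pytorch/${RUN_ID}/artifacts.zip" artifacts.zip   # exact key assumed
  unzip -o artifacts.zip && rm artifacts.zip
  aws s3 cp --recursive "s3://gha-artifacts/pytorch/pytorch/${RUN_ID}/td_results/" .     # prefix printed below
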
2024-12-18T03:40:12.1396195Z For more information, check the migration guide at https://a.co/7PzMCcy 2024-12-18T03:40:12.1397189Z (Use `node --trace-warnings ...` to show where the warning was created) 2024-12-18T03:40:12.4703787Z Found 1 objects with prefix pytorch/pytorch/12383255654/td_results/ 2024-12-18T03:40:12.4705011Z Starting download (1/1): /home/pytorchci/actions-runner/_work/pytorch/pytorch/td_results.json 2024-12-18T03:40:12.8440441Z Finished download (1/1): /home/pytorchci/actions-runner/_work/pytorch/pytorch/td_results.json 2024-12-18T03:40:12.8451370Z Artifact download has finished successfully 2024-12-18T03:40:12.8983054Z ##[group]Run mkdir -p .additional_ci_files 2024-12-18T03:40:12.8983870Z mkdir -p .additional_ci_files 2024-12-18T03:40:12.8984801Z mv td_results.json .additional_ci_files/td_results.json || true 2024-12-18T03:40:12.9043679Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:12.9044388Z env: 2024-12-18T03:40:12.9044813Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:12.9045417Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:12.9046440Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:12.9047406Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:12.9047938Z AWS_REGION: us-east-1 2024-12-18T03:40:12.9048544Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:12.9049261Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:12.9059801Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:12.9060290Z ##[endgroup] 2024-12-18T03:40:12.9242476Z ##[group]Run .github/scripts/parse_ref.py 2024-12-18T03:40:12.9243215Z .github/scripts/parse_ref.py 2024-12-18T03:40:12.9295981Z shell: /usr/bin/bash -e {0} 2024-12-18T03:40:12.9296527Z env: 2024-12-18T03:40:12.9296937Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:12.9297524Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:12.9298572Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:12.9299537Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:12.9300065Z AWS_REGION: us-east-1 2024-12-18T03:40:12.9300639Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:12.9301353Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:12.9311177Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:12.9311631Z ##[endgroup] 2024-12-18T03:40:12.9630471Z Prepare all required actions 2024-12-18T03:40:12.9686781Z ##[group]Run ./.github/actions/get-workflow-job-id 2024-12-18T03:40:12.9687432Z with: 2024-12-18T03:40:12.9688161Z github-token: *** 2024-12-18T03:40:12.9688618Z env: 2024-12-18T03:40:12.9689044Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:12.9689632Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:12.9690659Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:12.9691636Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:12.9692204Z AWS_REGION: us-east-1 2024-12-18T03:40:12.9692772Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:12.9693486Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:12.9704271Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:12.9704883Z ##[endgroup] 2024-12-18T03:40:12.9734871Z ##[group]Run set -eux 2024-12-18T03:40:12.9735405Z set -eux 2024-12-18T03:40:12.9736279Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2024-12-18T03:40:12.9788386Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:12.9789177Z env: 2024-12-18T03:40:12.9789586Z GIT_DEFAULT_BRANCH: main 
2024-12-18T03:40:12.9790148Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:12.9791118Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:12.9792012Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:12.9792524Z AWS_REGION: us-east-1 2024-12-18T03:40:12.9793090Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:12.9793749Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:12.9804367Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:12.9805062Z GITHUB_TOKEN: *** 2024-12-18T03:40:12.9805538Z ##[endgroup] 2024-12-18T03:40:12.9877735Z + python3 .github/scripts/get_workflow_job_id.py 12383255654 pytorch-rocm-hw-42 2024-12-18T03:40:14.3914368Z setting job-id=34566687110 2024-12-18T03:40:14.3915575Z setting job-name=linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed) 2024-12-18T03:40:14.4228980Z Prepare all required actions 2024-12-18T03:40:14.4229402Z Getting action download info 2024-12-18T03:40:14.6413827Z ##[group]Run ./.github/actions/filter-test-configs 2024-12-18T03:40:14.6414487Z with: 2024-12-18T03:40:14.6415441Z github-token: *** 2024-12-18T03:40:14.6417940Z test-matrix: {"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}]} 2024-12-18T03:40:14.6421002Z job-name: linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed) 2024-12-18T03:40:14.6422031Z env: 2024-12-18T03:40:14.6422439Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:14.6423105Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:14.6424142Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:14.6425087Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:14.6425614Z AWS_REGION: us-east-1 2024-12-18T03:40:14.6426155Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:14.6426848Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:14.6437112Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:14.6437598Z ##[endgroup] 2024-12-18T03:40:14.6503439Z ##[group]Run nick-fields/retry@v3.0.0 2024-12-18T03:40:14.6504012Z with: 2024-12-18T03:40:14.6504391Z shell: bash 2024-12-18T03:40:14.6504815Z timeout_minutes: 10 2024-12-18T03:40:14.6505263Z max_attempts: 5 2024-12-18T03:40:14.6505708Z retry_wait_seconds: 30 2024-12-18T03:40:14.6507106Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2024-12-18T03:40:14.6508560Z polling_interval_seconds: 1 2024-12-18T03:40:14.6509081Z warning_on_retry: true 2024-12-18T03:40:14.6509566Z continue_on_error: false 2024-12-18T03:40:14.6510044Z env: 2024-12-18T03:40:14.6510445Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:14.6511015Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:14.6512004Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:14.6512957Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:14.6513478Z AWS_REGION: us-east-1 2024-12-18T03:40:14.6514045Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:14.6514749Z AWS_SECRET_ACCESS_KEY: *** 
2024-12-18T03:40:14.6525057Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:14.6525763Z GITHUB_TOKEN: *** 2024-12-18T03:40:14.6526219Z ##[endgroup] 2024-12-18T03:40:14.7227125Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2024-12-18T03:40:14.9727291Z Defaulting to user installation because normal site-packages is not writeable 2024-12-18T03:40:15.0499255Z Requirement already satisfied: requests==2.27.1 in /home/pytorchci/.local/lib/python3.10/site-packages (2.27.1) 2024-12-18T03:40:15.0502954Z Requirement already satisfied: pyyaml==6.0.1 in /home/pytorchci/.local/lib/python3.10/site-packages (6.0.1) 2024-12-18T03:40:15.0591280Z Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests==2.27.1) (2020.6.20) 2024-12-18T03:40:15.0596978Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests==2.27.1) (1.26.5) 2024-12-18T03:40:15.0604735Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests==2.27.1) (3.3) 2024-12-18T03:40:15.0614399Z Requirement already satisfied: charset-normalizer~=2.0.0 in /home/pytorchci/.local/lib/python3.10/site-packages (from requests==2.27.1) (2.0.12) 2024-12-18T03:40:15.7230258Z Command completed after 1 attempt(s). 2024-12-18T03:40:15.7344358Z ##[group]Run set -x 2024-12-18T03:40:15.7345339Z set -x 2024-12-18T03:40:15.7345814Z  2024-12-18T03:40:15.7346530Z # Use relative path here as this could be checked out anywhere, not necessarily 2024-12-18T03:40:15.7347416Z # in runner workspace 2024-12-18T03:40:15.7348133Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2024-12-18T03:40:15.7405602Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:15.7406333Z env: 2024-12-18T03:40:15.7406750Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:15.7407326Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:15.7408339Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:15.7409285Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:15.7409813Z AWS_REGION: us-east-1 2024-12-18T03:40:15.7410397Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:15.7411114Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:15.7421476Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:15.7421979Z ##[endgroup] 2024-12-18T03:40:15.7509190Z + python3 /home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2024-12-18T03:40:15.7749149Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2024-12-18T03:40:15.7749976Z echo "Workflow: ${GITHUB_WORKFLOW}" 2024-12-18T03:40:15.7750603Z echo "Job name: ${JOB_NAME}" 2024-12-18T03:40:15.7751157Z  2024-12-18T03:40:15.7751865Z # Use relative path here as this could be checked out anywhere, not necessarily 2024-12-18T03:40:15.7752722Z # in runner workspace 2024-12-18T03:40:15.7753508Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2024-12-18T03:40:15.7754390Z  --workflow "${GITHUB_WORKFLOW}" \ 2024-12-18T03:40:15.7755105Z  --job-name "${JOB_NAME}" \ 2024-12-18T03:40:15.7758229Z  --test-matrix "{"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", 
"oncall:distributed"]}]}" \ 2024-12-18T03:40:15.7760962Z  --selected-test-configs "" \ 2024-12-18T03:40:15.7761573Z  --pr-number "${PR_NUMBER}" \ 2024-12-18T03:40:15.7762161Z  --tag "${TAG}" \ 2024-12-18T03:40:15.7762708Z  --event-name "${EVENT_NAME}" \ 2024-12-18T03:40:15.7763314Z  --schedule "${SCHEDULE}" \ 2024-12-18T03:40:15.7763924Z  --branch "${HEAD_BRANCH}" 2024-12-18T03:40:15.7814960Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:15.7815689Z env: 2024-12-18T03:40:15.7816136Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:15.7816730Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:15.7818139Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:15.7819092Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:15.7819627Z AWS_REGION: us-east-1 2024-12-18T03:40:15.7820210Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:15.7820908Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:15.7832182Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:15.7832904Z GITHUB_TOKEN: *** 2024-12-18T03:40:15.7833847Z JOB_NAME: linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed) 2024-12-18T03:40:15.7834860Z PR_NUMBER: 2024-12-18T03:40:15.7835350Z TAG: 2024-12-18T03:40:15.7835835Z EVENT_NAME: push 2024-12-18T03:40:15.7836367Z SCHEDULE: 2024-12-18T03:40:15.7836855Z HEAD_BRANCH: 2024-12-18T03:40:15.7837368Z ##[endgroup] 2024-12-18T03:40:15.7917257Z Workflow: periodic 2024-12-18T03:40:15.7918368Z Job name: linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed) 2024-12-18T03:40:16.4024022Z ##[group]Run echo "Filtered matrix:" 2024-12-18T03:40:16.4024698Z echo "Filtered matrix:" 2024-12-18T03:40:16.4027217Z echo "{"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu", "owners": ["module:rocm", "oncall:distributed"]}]}" 2024-12-18T03:40:16.4029689Z  2024-12-18T03:40:16.4030144Z echo 2024-12-18T03:40:16.4030739Z echo "Is the current job unstable? False" 2024-12-18T03:40:16.4031448Z  2024-12-18T03:40:16.4031877Z echo 2024-12-18T03:40:16.4032425Z echo "Is keep-going label set? False" 2024-12-18T03:40:16.4033102Z  2024-12-18T03:40:16.4033558Z echo 2024-12-18T03:40:16.4034072Z echo "Renabled issues? 
" 2024-12-18T03:40:16.4085476Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:16.4086181Z env: 2024-12-18T03:40:16.4086590Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:16.4087163Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:16.4088168Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:16.4089118Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:16.4089640Z AWS_REGION: us-east-1 2024-12-18T03:40:16.4090317Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:16.4091012Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:16.4101348Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:16.4101827Z ##[endgroup] 2024-12-18T03:40:16.4178013Z Filtered matrix: 2024-12-18T03:40:16.4180547Z {include: [{config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu, owners: [module:rocm, oncall:distributed]}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu, owners: [module:rocm, oncall:distributed]}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu, owners: [module:rocm, oncall:distributed]}]} 2024-12-18T03:40:16.4182947Z 2024-12-18T03:40:16.4183251Z Is the current job unstable? False 2024-12-18T03:40:16.4183646Z 2024-12-18T03:40:16.4183903Z Is keep-going label set? False 2024-12-18T03:40:16.4184321Z 2024-12-18T03:40:16.4184536Z Renabled issues? 2024-12-18T03:40:16.4237692Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2024-12-18T03:40:16.4238882Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2024-12-18T03:40:16.4289921Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-12-18T03:40:16.4290622Z env: 2024-12-18T03:40:16.4291027Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:16.4291609Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:16.4292627Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:16.4293972Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:16.4294490Z AWS_REGION: us-east-1 2024-12-18T03:40:16.4295270Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:16.4295967Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:16.4306653Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:16.4307230Z JOB_TIMEOUT: 300 2024-12-18T03:40:16.4307736Z ##[endgroup] 2024-12-18T03:40:16.4462560Z ##[group]Run set -x 2024-12-18T03:40:16.4463164Z set -x 2024-12-18T03:40:16.4463658Z  2024-12-18T03:40:16.4464234Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2024-12-18T03:40:16.4465106Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2024-12-18T03:40:16.4465960Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2024-12-18T03:40:16.4466743Z  TEST_COMMAND=.ci/caffe2/test.sh 2024-12-18T03:40:16.4467409Z else 2024-12-18T03:40:16.4467960Z  TEST_COMMAND=.ci/pytorch/test.sh 2024-12-18T03:40:16.4468613Z fi 2024-12-18T03:40:16.4469098Z  2024-12-18T03:40:16.4469814Z # detached container should get cleaned up by teardown_ec2_linux 2024-12-18T03:40:16.4470790Z # TODO: Stop building test binaries as part of the build phase 2024-12-18T03:40:16.4471635Z # Used for GPU_FLAG since that doesn't play nice 2024-12-18T03:40:16.4472389Z # shellcheck disable=SC2086,SC2090 2024-12-18T03:40:16.4472999Z container_name=$(docker run \ 2024-12-18T03:40:16.4473566Z  ${GPU_FLAG:-} \ 2024-12-18T03:40:16.4474081Z  -e BUILD_ENVIRONMENT \ 2024-12-18T03:40:16.4474622Z  -e PR_NUMBER \ 2024-12-18T03:40:16.4475146Z  -e GITHUB_ACTIONS \ 2024-12-18T03:40:16.4475768Z  -e GITHUB_REPOSITORY \ 2024-12-18T03:40:16.4476398Z  -e GITHUB_WORKFLOW \ 
2024-12-18T03:40:16.4477019Z  -e GITHUB_JOB \ 2024-12-18T03:40:16.4477602Z  -e GITHUB_RUN_ID \ 2024-12-18T03:40:16.4478146Z  -e GITHUB_RUN_NUMBER \ 2024-12-18T03:40:16.4478696Z  -e GITHUB_RUN_ATTEMPT \ 2024-12-18T03:40:16.4479228Z  -e JOB_ID \ 2024-12-18T03:40:16.4479690Z  -e JOB_NAME \ 2024-12-18T03:40:16.4480159Z  -e BRANCH \ 2024-12-18T03:40:16.4480609Z  -e SHA1 \ 2024-12-18T03:40:16.4481069Z  -e AWS_DEFAULT_REGION \ 2024-12-18T03:40:16.4481609Z  -e IN_WHEEL_TEST \ 2024-12-18T03:40:16.4482112Z  -e SHARD_NUMBER \ 2024-12-18T03:40:16.4482613Z  -e TEST_CONFIG \ 2024-12-18T03:40:16.4483107Z  -e NUM_TEST_SHARDS \ 2024-12-18T03:40:16.4483631Z  -e REENABLED_ISSUES \ 2024-12-18T03:40:16.4484183Z  -e CONTINUE_THROUGH_ERROR \ 2024-12-18T03:40:16.4484756Z  -e VERBOSE_TEST_LOGS \ 2024-12-18T03:40:16.4485281Z  -e TEST_SHOWLOCALS \ 2024-12-18T03:40:16.4485792Z  -e NO_TEST_TIMEOUT \ 2024-12-18T03:40:16.4486289Z  -e NO_TD \ 2024-12-18T03:40:16.4486800Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2024-12-18T03:40:16.4487410Z  -e SCCACHE_BUCKET \ 2024-12-18T03:40:16.4487962Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2024-12-18T03:40:16.4488600Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2024-12-18T03:40:16.4489263Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2024-12-18T03:40:16.4489885Z  -e TESTS_TO_INCLUDE \ 2024-12-18T03:40:16.4490513Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2024-12-18T03:40:16.4491214Z  --ulimit stack=10485760:83886080 \ 2024-12-18T03:40:16.4491788Z  --ulimit core=0 \ 2024-12-18T03:40:16.4492353Z  --security-opt seccomp=unconfined \ 2024-12-18T03:40:16.4492970Z  --cap-add=SYS_PTRACE \ 2024-12-18T03:40:16.4493498Z  --shm-size="8g" \ 2024-12-18T03:40:16.4493976Z  --tty \ 2024-12-18T03:40:16.4494411Z  --detach \ 2024-12-18T03:40:16.4495025Z  --name="${container_name}" \ 2024-12-18T03:40:16.4496003Z  --user jenkins \ 2024-12-18T03:40:16.4496626Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2024-12-18T03:40:16.4497333Z  -w /var/lib/jenkins/workspace \ 2024-12-18T03:40:16.4497905Z  "${DOCKER_IMAGE}" 2024-12-18T03:40:16.4498373Z ) 2024-12-18T03:40:16.4498815Z # save container name for later step 2024-12-18T03:40:16.4499907Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2024-12-18T03:40:16.4501255Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2024-12-18T03:40:16.4502955Z docker exec -t "${container_name}" sh -c "cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2024-12-18T03:40:16.4562603Z shell: /usr/bin/bash -e {0} 2024-12-18T03:40:16.4563121Z env: 2024-12-18T03:40:16.4563522Z GIT_DEFAULT_BRANCH: main 2024-12-18T03:40:16.4564086Z DOCKER_HOST: unix:///run/user/1001/docker.sock 2024-12-18T03:40:16.4565113Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon 2024-12-18T03:40:16.4566054Z AWS_DEFAULT_REGION: us-east-1 2024-12-18T03:40:16.4566573Z AWS_REGION: us-east-1 2024-12-18T03:40:16.4567141Z AWS_ACCESS_KEY_ID: *** 2024-12-18T03:40:16.4567838Z AWS_SECRET_ACCESS_KEY: *** 2024-12-18T03:40:16.4578155Z AWS_SESSION_TOKEN: *** 2024-12-18T03:40:16.4578729Z BUILD_ENVIRONMENT: linux-focal-rocm6.2-py3.10 2024-12-18T03:40:16.4579341Z PR_NUMBER: 2024-12-18T03:40:16.4579786Z GITHUB_REPOSITORY: pytorch/pytorch 2024-12-18T03:40:16.4580347Z GITHUB_WORKFLOW: periodic 2024-12-18T03:40:16.4580830Z GITHUB_JOB: test 2024-12-18T03:40:16.4581263Z GITHUB_RUN_ID: 12383255654 2024-12-18T03:40:16.4581755Z GITHUB_RUN_NUMBER: 15427 2024-12-18T03:40:16.4582233Z GITHUB_RUN_ATTEMPT: 1 2024-12-18T03:40:16.4582679Z JOB_ID: 34566687110 2024-12-18T03:40:16.4583599Z JOB_NAME: linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed) 2024-12-18T03:40:16.4584770Z BRANCH: release/2.6 2024-12-18T03:40:16.4585377Z SHA1: 0cdf8b1d09254cfda66191d1bd01e3041c3c76f7 2024-12-18T03:40:16.4586111Z CONTINUE_THROUGH_ERROR: False 2024-12-18T03:40:16.4586720Z VERBOSE_TEST_LOGS: False 2024-12-18T03:40:16.4587290Z TEST_SHOWLOCALS: False 2024-12-18T03:40:16.4587849Z NO_TEST_TIMEOUT: False 2024-12-18T03:40:16.4588370Z NO_TD: False 2024-12-18T03:40:16.4588872Z TEST_CONFIG: distributed 2024-12-18T03:40:16.4589420Z SHARD_NUMBER: 2 2024-12-18T03:40:16.4589893Z NUM_TEST_SHARDS: 3 2024-12-18T03:40:16.4590327Z REENABLED_ISSUES: 2024-12-18T03:40:16.4590871Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2024-12-18T03:40:16.4592200Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:40:16.4593662Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2024-12-18T03:40:16.4594473Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2024-12-18T03:40:16.4595082Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2024-12-18T03:40:16.4595731Z TESTS_TO_INCLUDE: 2024-12-18T03:40:16.4596240Z ##[endgroup] 2024-12-18T03:40:16.4673645Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2024-12-18T03:40:16.4674323Z + [[ linux-focal-rocm6.2-py3.10 == *onnx* ]] 2024-12-18T03:40:16.4674967Z + TEST_COMMAND=.ci/pytorch/test.sh 2024-12-18T03:40:16.4686380Z +++ nproc --ignore=2 2024-12-18T03:40:16.4712719Z ++ docker run --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=62 -e SCCACHE_BUCKET -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE --env-file=/tmp/github_env_12383255654 --ulimit stack=10485760:83886080 --ulimit core=0 --security-opt 
seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/pytorchci/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-rocm-n-py3:45e1356b47a284893081276eff3000b7b534f3b1 2024-12-18T03:40:19.6348937Z + container_name=a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83 2024-12-18T03:40:19.6350474Z + echo CONTAINER_NAME=a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83 2024-12-18T03:40:19.6352694Z + docker exec -t a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83 sh -c 'cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2024-12-18T03:40:34.7927634Z Processing ./dist/torch-2.6.0a0+git0cdf8b1-cp310-cp310-linux_x86_64.whl 2024-12-18T03:40:35.2438939Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.6.0a0+git0cdf8b1) (3.16.1) 2024-12-18T03:40:35.2441192Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.6.0a0+git0cdf8b1) (4.12.2) 2024-12-18T03:40:35.2443154Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.6.0a0+git0cdf8b1) (2.8.8) 2024-12-18T03:40:35.2445391Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.6.0a0+git0cdf8b1) (3.1.4) 2024-12-18T03:40:35.2447199Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.6.0a0+git0cdf8b1) (2024.10.0) 2024-12-18T03:40:35.2450870Z Requirement already satisfied: sympy==1.13.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.6.0a0+git0cdf8b1) (1.13.1) 2024-12-18T03:40:35.2517962Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy==1.13.1->torch==2.6.0a0+git0cdf8b1) (1.3.0) 2024-12-18T03:40:35.2955314Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.6.0a0+git0cdf8b1) (3.0.2) 2024-12-18T03:40:36.1545098Z Installing collected packages: torch 2024-12-18T03:40:44.5660084Z Successfully installed torch-2.6.0a0+git0cdf8b1 2024-12-18T03:40:44.6193386Z + export TERM=vt100 2024-12-18T03:40:44.6193921Z + TERM=vt100 2024-12-18T03:40:44.6195157Z ++ dirname .ci/pytorch/test.sh 2024-12-18T03:40:44.6213643Z + source .ci/pytorch/common.sh 2024-12-18T03:40:44.6222620Z +++ dirname .ci/pytorch/common.sh 2024-12-18T03:40:44.6235821Z ++ source .ci/pytorch/common_utils.sh 2024-12-18T03:40:44.6236508Z +++ declare -f -t trap_add 2024-12-18T03:40:44.6242863Z ++ set -ex 2024-12-18T03:40:44.6243474Z ++ [[ linux-focal-rocm6.2-py3.10 == *rocm* ]] 2024-12-18T03:40:44.6244168Z ++ unset HIP_PLATFORM 2024-12-18T03:40:44.6244680Z ++ export PYTORCH_TEST_WITH_ROCM=1 2024-12-18T03:40:44.6245232Z ++ PYTORCH_TEST_WITH_ROCM=1 2024-12-18T03:40:44.6245740Z ++ export HSAKMT_DEBUG_LEVEL=4 2024-12-18T03:40:44.6246239Z ++ HSAKMT_DEBUG_LEVEL=4 2024-12-18T03:40:44.6246766Z ++ export HSA_FORCE_FINE_GRAIN_PCIE=1 2024-12-18T03:40:44.6247346Z ++ HSA_FORCE_FINE_GRAIN_PCIE=1 2024-12-18T03:40:44.6247853Z ++ BUILD_TEST_LIBTORCH=0 2024-12-18T03:40:44.6248384Z + [[ linux-focal-rocm6.2-py3.10 != *rocm* ]] 2024-12-18T03:40:44.6248994Z + echo 'Environment variables:' 2024-12-18T03:40:44.6249515Z 
Environment variables: 2024-12-18T03:40:44.6249953Z + env 2024-12-18T03:40:44.6259182Z INSTALLED_DB=yes 2024-12-18T03:40:44.6260083Z GITHUB_WORKSPACE=/home/pytorchci/actions-runner/_work/pytorch/pytorch 2024-12-18T03:40:44.6260987Z AOTRITON_INSTALLED_PREFIX=/opt/rocm/aotriton 2024-12-18T03:40:44.6262391Z CONTINUE_THROUGH_ERROR=False 2024-12-18T03:40:44.6262988Z BUILD_ENVIRONMENT=linux-focal-rocm6.2-py3.10 2024-12-18T03:40:44.6263594Z HOSTNAME=a195586eb8a1 2024-12-18T03:40:44.6264653Z GITHUB_PATH=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/add_path_d712ed0b-2726-4418-ba06-82fbe2ae25f1 2024-12-18T03:40:44.6265765Z GITHUB_ACTION=__self 2024-12-18T03:40:44.6266605Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2024-12-18T03:40:44.6267155Z GITHUB_RUN_NUMBER=15427 2024-12-18T03:40:44.6267612Z TEST_CONFIG=distributed 2024-12-18T03:40:44.6268087Z GITHUB_REPOSITORY_OWNER_ID=21003710 2024-12-18T03:40:44.6268644Z AWS_DEFAULT_REGION=us-east-1 2024-12-18T03:40:44.6269150Z GITHUB_TRIGGERING_ACTOR=malfet 2024-12-18T03:40:44.6269656Z GITHUB_REF_TYPE=branch 2024-12-18T03:40:44.6273732Z *** 2024-12-18T03:40:44.6274182Z GITHUB_REPOSITORY_ID=65600975 2024-12-18T03:40:44.6274716Z GITHUB_ACTIONS=true 2024-12-18T03:40:44.6275238Z SHA1=0cdf8b1d09254cfda66191d1bd01e3041c3c76f7 2024-12-18T03:40:44.6275945Z GITHUB_SHA=0cdf8b1d09254cfda66191d1bd01e3041c3c76f7 2024-12-18T03:40:44.6276965Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic.yml@refs/heads/release/2.6 2024-12-18T03:40:44.6277885Z VERBOSE_TEST_LOGS=False 2024-12-18T03:40:44.6278373Z GITHUB_REF=refs/heads/release/2.6 2024-12-18T03:40:44.6278886Z SHARD_NUMBER=2 2024-12-18T03:40:44.6279326Z GITHUB_REF_PROTECTED=true 2024-12-18T03:40:44.6279833Z HOME=/var/lib/jenkins 2024-12-18T03:40:44.6280347Z GITHUB_API_URL=https://api.github.com 2024-12-18T03:40:44.6280953Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2024-12-18T03:40:44.6281484Z LANG=C.UTF-8 2024-12-18T03:40:44.6281964Z PYTORCH_TEST_WITH_ROCM=1 2024-12-18T03:40:44.6282434Z NUM_TEST_SHARDS=3 2024-12-18T03:40:44.6283214Z GITHUB_STATE=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/save_state_d712ed0b-2726-4418-ba06-82fbe2ae25f1 2024-12-18T03:40:44.6284048Z JOB_NAME=linux-focal-rocm6.2-py3.10 / test (distributed, 2, 3, linux.rocm.gpu, module:rocm, oncall:distributed) 2024-12-18T03:40:44.6284560Z MAGMA_HOME=/opt/rocm/magma 2024-12-18T03:40:44.6285094Z GITHUB_ENV=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_env_d712ed0b-2726-4418-ba06-82fbe2ae25f1 2024-12-18T03:40:44.6285649Z HSAKMT_DEBUG_LEVEL=4 2024-12-18T03:40:44.6286056Z GITHUB_EVENT_PATH=/home/pytorchci/actions-runner/_work/_temp/_github_workflow/event.json 2024-12-18T03:40:44.6286519Z GITHUB_EVENT_NAME=push 2024-12-18T03:40:44.6286760Z GITHUB_RUN_ID=12383255654 2024-12-18T03:40:44.6287333Z GITHUB_STEP_SUMMARY=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/step_summary_d712ed0b-2726-4418-ba06-82fbe2ae25f1 2024-12-18T03:40:44.6287943Z GITHUB_ACTOR=malfet 2024-12-18T03:40:44.6288163Z PR_NUMBER= 2024-12-18T03:40:44.6288376Z GITHUB_RUN_ATTEMPT=1 2024-12-18T03:40:44.6288617Z ANACONDA_PYTHON_VERSION=3.10 2024-12-18T03:40:44.6288928Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2024-12-18T03:40:44.6289250Z TERM=vt100 2024-12-18T03:40:44.6289452Z INSTALLED_VISION=yes 2024-12-18T03:40:44.6289685Z BRANCH=release/2.6 2024-12-18T03:40:44.6289904Z TESTS_TO_INCLUDE= 2024-12-18T03:40:44.6290359Z 
GITHUB_ACTION_PATH=/home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2024-12-18T03:40:44.6290880Z GITHUB_SERVER_URL=https://github.com 2024-12-18T03:40:44.6291168Z PYTORCH_ROCM_ARCH=gfx90a 2024-12-18T03:40:44.6291406Z REENABLED_ISSUES= 2024-12-18T03:40:44.6291648Z SHLVL=1 2024-12-18T03:40:44.6291845Z MAX_JOBS=62 2024-12-18T03:40:44.6292048Z GITHUB_ACTOR_ID=2453524 2024-12-18T03:40:44.6292360Z GITHUB_WORKFLOW_SHA=0cdf8b1d09254cfda66191d1bd01e3041c3c76f7 2024-12-18T03:40:44.6292713Z GITHUB_REF_NAME=release/2.6 2024-12-18T03:40:44.6292962Z ROCM_PATH=/opt/rocm 2024-12-18T03:40:44.6293305Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2024-12-18T03:40:44.6293692Z GITHUB_JOB=test 2024-12-18T03:40:44.6293903Z NO_TEST_TIMEOUT=False 2024-12-18T03:40:44.6294152Z GITHUB_REPOSITORY=pytorch/pytorch 2024-12-18T03:40:44.6294414Z LC_ALL=C.UTF-8 2024-12-18T03:40:44.6294700Z GITHUB_RETENTION_DAYS=90 2024-12-18T03:40:44.6295154Z GITHUB_ACTION_REPOSITORY= 2024-12-18T03:40:44.6296010Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-12-18T03:40:44.6296877Z GITHUB_BASE_REF= 2024-12-18T03:40:44.6297233Z CI=true 2024-12-18T03:40:44.6297441Z HSA_FORCE_FINE_GRAIN_PCIE=1 2024-12-18T03:40:44.6297708Z GITHUB_REPOSITORY_OWNER=pytorch 2024-12-18T03:40:44.6297962Z JOB_ID=34566687110 2024-12-18T03:40:44.6298180Z INSTALLED_PROTOBUF=yes 2024-12-18T03:40:44.6298412Z GITHUB_HEAD_REF= 2024-12-18T03:40:44.6298625Z GITHUB_ACTION_REF= 2024-12-18T03:40:44.6298892Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2024-12-18T03:40:44.6299228Z TEST_SHOWLOCALS=False 2024-12-18T03:40:44.6299474Z GITHUB_WORKFLOW=periodic 2024-12-18T03:40:44.6299727Z DEBIAN_FRONTEND=noninteractive 2024-12-18T03:40:44.6300308Z GITHUB_OUTPUT=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_output_d712ed0b-2726-4418-ba06-82fbe2ae25f1 2024-12-18T03:40:44.6300886Z NO_TD=False 2024-12-18T03:40:44.6301094Z OLDPWD=/var/lib/jenkins 2024-12-18T03:40:44.6301323Z _=/usr/bin/env 2024-12-18T03:40:44.6301620Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2024-12-18T03:40:44.6454990Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2024-12-18T03:40:44.6456169Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-12-18T03:40:44.6457200Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2024-12-18T03:40:44.6458248Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2024-12-18T03:40:44.6459028Z + BUILD_DIR=build 2024-12-18T03:40:44.6459486Z + BUILD_RENAMED_DIR=build_renamed 2024-12-18T03:40:44.6460029Z + BUILD_BIN_DIR=build/bin 2024-12-18T03:40:44.6460505Z + SHARD_NUMBER=2 2024-12-18T03:40:44.6460927Z + NUM_TEST_SHARDS=3 2024-12-18T03:40:44.6461370Z + export VALGRIND=ON 2024-12-18T03:40:44.6461808Z + VALGRIND=ON 2024-12-18T03:40:44.6462365Z + [[ linux-focal-rocm6.2-py3.10 == *clang9* ]] 2024-12-18T03:40:44.6463124Z + [[ linux-focal-rocm6.2-py3.10 == *xpu* ]] 2024-12-18T03:40:44.6463761Z + [[ 0 == \1 ]] 2024-12-18T03:40:44.6464230Z + [[ False == \1 ]] 2024-12-18T03:40:44.6464797Z + [[ linux-focal-rocm6.2-py3.10 != *bazel* ]] 2024-12-18T03:40:44.6465531Z ++ realpath build/custom_test_artifacts 2024-12-18T03:40:44.6476234Z + 
CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2024-12-18T03:40:44.6477196Z + [[ -n '' ]] 2024-12-18T03:40:44.6477651Z + echo 'Environment variables' 2024-12-18T03:40:44.6478177Z Environment variables 2024-12-18T03:40:44.6501458Z + env [duplicate environment dump omitted; identical to the listing above except that VALGRIND=ON is now also set] 2024-12-18T03:40:44.6532141Z + echo 'Testing pytorch' 2024-12-18T03:40:44.6532423Z Testing pytorch 2024-12-18T03:40:44.6533135Z + export LANG=C.UTF-8 2024-12-18T03:40:44.6533659Z + LANG=C.UTF-8 2024-12-18T03:40:44.6534119Z + PR_NUMBER= 2024-12-18T03:40:44.6534794Z + [[ distributed == \d\e\f\a\u\l\t ]] 2024-12-18T03:40:44.6535387Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2024-12-18T03:40:44.6535997Z + [[ linux-focal-rocm6.2-py3.10 == *rocm* ]] 2024-12-18T03:40:44.6536595Z + export HIP_VISIBLE_DEVICES=0,1 2024-12-18T03:40:44.6537117Z + HIP_VISIBLE_DEVICES=0,1 2024-12-18T03:40:44.6537608Z + [[ distributed == \s\l\o\w ]] 2024-12-18T03:40:44.6538207Z + [[ linux-focal-rocm6.2-py3.10 == *slow-gradcheck* ]] 2024-12-18T03:40:44.6538888Z + [[ linux-focal-rocm6.2-py3.10 == *cuda* ]] 2024-12-18T03:40:44.6539496Z + [[ linux-focal-rocm6.2-py3.10 == *rocm* ]] 2024-12-18T03:40:44.6540118Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2024-12-18T03:40:44.6540747Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2024-12-18T03:40:44.6541318Z + [[ distributed == *crossref* ]] 2024-12-18T03:40:44.6541881Z + [[ linux-focal-rocm6.2-py3.10 == *rocm* ]] 2024-12-18T03:40:44.6542542Z + export VALGRIND=OFF
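The xtrace above shows the test script's per-backend setup: VALGRIND defaults to ON but is switched off for ROCm builds, distributed runs are pinned to two GPUs via HIP_VISIBLE_DEVICES, and device-only tests are filtered to the `cuda` device type (ROCm builds expose HIP devices through PyTorch's torch.cuda API). A minimal bash sketch of that branching, reconstructed from the trace — the if-structure is an assumption; only the variable names and values come from the log:

```bash
# Reconstruction of the branching visible in the xtrace above; control
# flow is assumed, variables and values are taken from the log.
BUILD_ENVIRONMENT=linux-focal-rocm6.2-py3.10
TEST_CONFIG=distributed

if [[ "$TEST_CONFIG" == "distributed" && "$BUILD_ENVIRONMENT" == *rocm* ]]; then
  # Distributed shards on these runners use exactly two GPUs.
  export HIP_VISIBLE_DEVICES=0,1
fi

if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then
  # ROCm builds surface HIP devices through torch.cuda, so the same
  # device-type filter is applied as on CUDA builds.
  export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda
  # Valgrind is enabled by default earlier in the script, then
  # disabled again for ROCm jobs, as the trace continues below.
  export VALGRIND=OFF
fi
```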
2024-12-18T03:40:44.6543057Z + VALGRIND=OFF 2024-12-18T03:40:44.6543522Z + rocminfo 2024-12-18T03:40:44.6798516Z ROCk module version 6.8.5 is loaded 2024-12-18T03:40:44.7431702Z ===================== 2024-12-18T03:40:44.7432375Z HSA System Attributes 2024-12-18T03:40:44.7432904Z ===================== 2024-12-18T03:40:44.7433375Z Runtime Version: 1.14 2024-12-18T03:40:44.7433923Z Runtime Ext Version: 1.6 2024-12-18T03:40:44.7434475Z System Timestamp Freq.: 1000.000000MHz 2024-12-18T03:40:44.7435353Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2024-12-18T03:40:44.7436329Z Machine Model: LARGE 2024-12-18T03:40:44.7437077Z System Endianness: LITTLE 2024-12-18T03:40:44.7437729Z Mwaitx: DISABLED 2024-12-18T03:40:44.7438246Z DMAbuf Support: YES 2024-12-18T03:40:44.7438557Z 2024-12-18T03:40:44.7438748Z ========== 2024-12-18T03:40:44.7439209Z HSA Agents 2024-12-18T03:40:44.7439647Z ========== 2024-12-18T03:40:44.7440079Z ******* 2024-12-18T03:40:44.7440512Z Agent 1 2024-12-18T03:40:44.7440941Z ******* 2024-12-18T03:40:44.7441490Z Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:40:44.7442271Z Uuid: CPU-XX 2024-12-18T03:40:44.7443116Z Marketing Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:40:44.7444021Z Vendor Name: CPU 2024-12-18T03:40:44.7444851Z Feature: None specified 2024-12-18T03:40:44.7445668Z Profile: FULL_PROFILE 2024-12-18T03:40:44.7446512Z Float Round Mode: NEAR 2024-12-18T03:40:44.7447376Z Max Queue Number: 0(0x0) 2024-12-18T03:40:44.7448209Z Queue Min Size: 0(0x0) 2024-12-18T03:40:44.7448958Z Queue Max Size: 0(0x0) 2024-12-18T03:40:44.7449636Z Queue Type: MULTI 2024-12-18T03:40:44.7450280Z Node: 0 2024-12-18T03:40:44.7450932Z Device Type: CPU 2024-12-18T03:40:44.7451550Z Cache Info: 2024-12-18T03:40:44.7452078Z L1: 32768(0x8000) KB 2024-12-18T03:40:44.7453252Z Chip ID: 0(0x0) 2024-12-18T03:40:44.7453992Z ASIC Revision: 0(0x0) 2024-12-18T03:40:44.7455045Z Cacheline Size: 64(0x40) 2024-12-18T03:40:44.7455869Z Max Clock Freq. (MHz): 2600 2024-12-18T03:40:44.7456821Z BDFID: 0 2024-12-18T03:40:44.7457487Z Internal Node ID: 0 2024-12-18T03:40:44.7458178Z Compute Unit: 32 2024-12-18T03:40:44.7458851Z SIMDs per CU: 0 2024-12-18T03:40:44.7459546Z Shader Engines: 0 2024-12-18T03:40:44.7460257Z Shader Arrs. per Eng.: 0 2024-12-18T03:40:44.7460990Z WatchPts on Addr. 
Ranges:1 2024-12-18T03:40:44.7461654Z Memory Properties: 2024-12-18T03:40:44.7462162Z Features: None 2024-12-18T03:40:44.7462642Z Pool Info: 2024-12-18T03:40:44.7463115Z Pool 1 2024-12-18T03:40:44.7463726Z Segment: GLOBAL; FLAGS: FINE GRAINED 2024-12-18T03:40:44.7464439Z Size: 65839404(0x3eca12c) KB 2024-12-18T03:40:44.7465132Z Allocatable: TRUE 2024-12-18T03:40:44.7465847Z Alloc Granule: 4KB 2024-12-18T03:40:44.7466598Z Alloc Recommended Granule:4KB 2024-12-18T03:40:44.7467351Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7468081Z Accessible by all: TRUE 2024-12-18T03:40:44.7468715Z Pool 2 2024-12-18T03:40:44.7469302Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2024-12-18T03:40:44.7469995Z Size: 65839404(0x3eca12c) KB 2024-12-18T03:40:44.7470681Z Allocatable: TRUE 2024-12-18T03:40:44.7471391Z Alloc Granule: 4KB 2024-12-18T03:40:44.7472137Z Alloc Recommended Granule:4KB 2024-12-18T03:40:44.7473031Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7473893Z Accessible by all: TRUE 2024-12-18T03:40:44.7474628Z Pool 3 2024-12-18T03:40:44.7475277Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:40:44.7475957Z Size: 65839404(0x3eca12c) KB 2024-12-18T03:40:44.7476625Z Allocatable: TRUE 2024-12-18T03:40:44.7477328Z Alloc Granule: 4KB 2024-12-18T03:40:44.7478076Z Alloc Recommended Granule:4KB 2024-12-18T03:40:44.7478823Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7479549Z Accessible by all: TRUE 2024-12-18T03:40:44.7480175Z ISA Info: 2024-12-18T03:40:44.7480646Z ******* 2024-12-18T03:40:44.7481101Z Agent 2 2024-12-18T03:40:44.7481537Z ******* 2024-12-18T03:40:44.7482069Z Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:40:44.7482844Z Uuid: CPU-XX 2024-12-18T03:40:44.7483540Z Marketing Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:40:44.7484263Z Vendor Name: CPU 2024-12-18T03:40:44.7484951Z Feature: None specified 2024-12-18T03:40:44.7485648Z Profile: FULL_PROFILE 2024-12-18T03:40:44.7486670Z Float Round Mode: NEAR 2024-12-18T03:40:44.7487371Z Max Queue Number: 0(0x0) 2024-12-18T03:40:44.7488057Z Queue Min Size: 0(0x0) 2024-12-18T03:40:44.7488967Z Queue Max Size: 0(0x0) 2024-12-18T03:40:44.7489654Z Queue Type: MULTI 2024-12-18T03:40:44.7490299Z Node: 1 2024-12-18T03:40:44.7490948Z Device Type: CPU 2024-12-18T03:40:44.7491556Z Cache Info: 2024-12-18T03:40:44.7492076Z L1: 32768(0x8000) KB 2024-12-18T03:40:44.7492700Z Chip ID: 0(0x0) 2024-12-18T03:40:44.7493361Z ASIC Revision: 0(0x0) 2024-12-18T03:40:44.7494072Z Cacheline Size: 64(0x40) 2024-12-18T03:40:44.7494886Z Max Clock Freq. (MHz): 2600 2024-12-18T03:40:44.7495543Z BDFID: 0 2024-12-18T03:40:44.7496200Z Internal Node ID: 1 2024-12-18T03:40:44.7496897Z Compute Unit: 32 2024-12-18T03:40:44.7497574Z SIMDs per CU: 0 2024-12-18T03:40:44.7498252Z Shader Engines: 0 2024-12-18T03:40:44.7498968Z Shader Arrs. per Eng.: 0 2024-12-18T03:40:44.7499704Z WatchPts on Addr. 
Ranges:1 2024-12-18T03:40:44.7500346Z Memory Properties: 2024-12-18T03:40:44.7500831Z Features: None 2024-12-18T03:40:44.7501319Z Pool Info: 2024-12-18T03:40:44.7501793Z Pool 1 2024-12-18T03:40:44.7502444Z Segment: GLOBAL; FLAGS: FINE GRAINED 2024-12-18T03:40:44.7503268Z Size: 65997864(0x3ef0c28) KB 2024-12-18T03:40:44.7504077Z Allocatable: TRUE 2024-12-18T03:40:44.7504932Z Alloc Granule: 4KB 2024-12-18T03:40:44.7505823Z Alloc Recommended Granule:4KB 2024-12-18T03:40:44.7506736Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7507632Z Accessible by all: TRUE 2024-12-18T03:40:44.7508380Z Pool 2 2024-12-18T03:40:44.7509018Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2024-12-18T03:40:44.7509711Z Size: 65997864(0x3ef0c28) KB 2024-12-18T03:40:44.7510403Z Allocatable: TRUE 2024-12-18T03:40:44.7511131Z Alloc Granule: 4KB 2024-12-18T03:40:44.7511985Z Alloc Recommended Granule:4KB 2024-12-18T03:40:44.7512750Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7513485Z Accessible by all: TRUE 2024-12-18T03:40:44.7514118Z Pool 3 2024-12-18T03:40:44.7514697Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:40:44.7515372Z Size: 65997864(0x3ef0c28) KB 2024-12-18T03:40:44.7516033Z Allocatable: TRUE 2024-12-18T03:40:44.7516737Z Alloc Granule: 4KB 2024-12-18T03:40:44.7517474Z Alloc Recommended Granule:4KB 2024-12-18T03:40:44.7518217Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7519263Z Accessible by all: TRUE 2024-12-18T03:40:44.7519894Z ISA Info: 2024-12-18T03:40:44.7520349Z ******* 2024-12-18T03:40:44.7520794Z Agent 3 2024-12-18T03:40:44.7521232Z ******* 2024-12-18T03:40:44.7522010Z Name: gfx90a 2024-12-18T03:40:44.7522675Z Uuid: GPU-48775b8a453f84af 2024-12-18T03:40:44.7523377Z Marketing Name: AMD Instinct MI210 2024-12-18T03:40:44.7524091Z Vendor Name: AMD 2024-12-18T03:40:44.7524897Z Feature: KERNEL_DISPATCH 2024-12-18T03:40:44.7525740Z Profile: BASE_PROFILE 2024-12-18T03:40:44.7526518Z Float Round Mode: NEAR 2024-12-18T03:40:44.7527218Z Max Queue Number: 128(0x80) 2024-12-18T03:40:44.7527914Z Queue Min Size: 64(0x40) 2024-12-18T03:40:44.7528594Z Queue Max Size: 131072(0x20000) 2024-12-18T03:40:44.7529265Z Queue Type: MULTI 2024-12-18T03:40:44.7529919Z Node: 2 2024-12-18T03:40:44.7530566Z Device Type: GPU 2024-12-18T03:40:44.7531168Z Cache Info: 2024-12-18T03:40:44.7531673Z L1: 16(0x10) KB 2024-12-18T03:40:44.7532313Z L2: 8192(0x2000) KB 2024-12-18T03:40:44.7533056Z Chip ID: 29711(0x740f) 2024-12-18T03:40:44.7533846Z ASIC Revision: 1(0x1) 2024-12-18T03:40:44.7534786Z Cacheline Size: 64(0x40) 2024-12-18T03:40:44.7535556Z Max Clock Freq. (MHz): 1700 2024-12-18T03:40:44.7536211Z BDFID: 768 2024-12-18T03:40:44.7536873Z Internal Node ID: 2 2024-12-18T03:40:44.7537567Z Compute Unit: 104 2024-12-18T03:40:44.7538244Z SIMDs per CU: 4 2024-12-18T03:40:44.7538925Z Shader Engines: 8 2024-12-18T03:40:44.7539630Z Shader Arrs. per Eng.: 1 2024-12-18T03:40:44.7540359Z WatchPts on Addr. 
Ranges:4 2024-12-18T03:40:44.7541095Z Coherent Host Access: FALSE 2024-12-18T03:40:44.7541738Z Memory Properties: 2024-12-18T03:40:44.7542300Z Features: KERNEL_DISPATCH 2024-12-18T03:40:44.7543086Z Fast F16 Operation: TRUE 2024-12-18T03:40:44.7543946Z Wavefront Size: 64(0x40) 2024-12-18T03:40:44.7544801Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:40:44.7545581Z Workgroup Max Size per Dimension: 2024-12-18T03:40:44.7546265Z x 1024(0x400) 2024-12-18T03:40:44.7546985Z y 1024(0x400) 2024-12-18T03:40:44.7547673Z z 1024(0x400) 2024-12-18T03:40:44.7548410Z Max Waves Per CU: 32(0x20) 2024-12-18T03:40:44.7549125Z Max Work-item Per CU: 2048(0x800) 2024-12-18T03:40:44.7549834Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:40:44.7550474Z Grid Max Size per Dimension: 2024-12-18T03:40:44.7550996Z x 4294967295(0xffffffff) 2024-12-18T03:40:44.7551940Z y 4294967295(0xffffffff) 2024-12-18T03:40:44.7552531Z z 4294967295(0xffffffff) 2024-12-18T03:40:44.7553210Z Max fbarriers/Workgrp: 32 2024-12-18T03:40:44.7554021Z Packet Processor uCode:: 83 2024-12-18T03:40:44.7555047Z SDMA engine uCode:: 8 2024-12-18T03:40:44.7555776Z IOMMU Support:: None 2024-12-18T03:40:44.7556409Z Pool Info: 2024-12-18T03:40:44.7556883Z Pool 1 2024-12-18T03:40:44.7557478Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:40:44.7558176Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:40:44.7558860Z Allocatable: TRUE 2024-12-18T03:40:44.7559573Z Alloc Granule: 4KB 2024-12-18T03:40:44.7560324Z Alloc Recommended Granule:2048KB 2024-12-18T03:40:44.7561074Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7561814Z Accessible by all: FALSE 2024-12-18T03:40:44.7562508Z Pool 2 2024-12-18T03:40:44.7563212Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2024-12-18T03:40:44.7564038Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:40:44.7564868Z Allocatable: TRUE 2024-12-18T03:40:44.7565635Z Alloc Granule: 4KB 2024-12-18T03:40:44.7566380Z Alloc Recommended Granule:2048KB 2024-12-18T03:40:44.7567130Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7567860Z Accessible by all: FALSE 2024-12-18T03:40:44.7568487Z Pool 3 2024-12-18T03:40:44.7569061Z Segment: GLOBAL; FLAGS: FINE GRAINED 2024-12-18T03:40:44.7569727Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:40:44.7570385Z Allocatable: TRUE 2024-12-18T03:40:44.7571093Z Alloc Granule: 4KB 2024-12-18T03:40:44.7571831Z Alloc Recommended Granule:2048KB 2024-12-18T03:40:44.7572575Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7573298Z Accessible by all: FALSE 2024-12-18T03:40:44.7573921Z Pool 4 2024-12-18T03:40:44.7574468Z Segment: GROUP 2024-12-18T03:40:44.7575198Z Size: 64(0x40) KB 2024-12-18T03:40:44.7575858Z Allocatable: FALSE 2024-12-18T03:40:44.7576571Z Alloc Granule: 0KB 2024-12-18T03:40:44.7577304Z Alloc Recommended Granule:0KB 2024-12-18T03:40:44.7578036Z Alloc Alignment: 0KB 2024-12-18T03:40:44.7578765Z Accessible by all: FALSE 2024-12-18T03:40:44.7579390Z ISA Info: 2024-12-18T03:40:44.7579855Z ISA 1 2024-12-18T03:40:44.7580457Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2024-12-18T03:40:44.7581225Z Machine Models: HSA_MACHINE_MODEL_LARGE 2024-12-18T03:40:44.7581986Z Profiles: HSA_PROFILE_BASE 2024-12-18T03:40:44.7582857Z Default Rounding Mode: NEAR 2024-12-18T03:40:44.7583734Z Default Rounding Mode: NEAR 2024-12-18T03:40:44.7584932Z Fast f16: TRUE 2024-12-18T03:40:44.7585751Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:40:44.7586533Z Workgroup Max Size per Dimension: 2024-12-18T03:40:44.7587225Z x 1024(0x400) 2024-12-18T03:40:44.7588266Z y 1024(0x400) 2024-12-18T03:40:44.7588860Z z 1024(0x400) 
2024-12-18T03:40:44.7589499Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:40:44.7590129Z Grid Max Size per Dimension: 2024-12-18T03:40:44.7590691Z x 4294967295(0xffffffff) 2024-12-18T03:40:44.7591283Z y 4294967295(0xffffffff) 2024-12-18T03:40:44.7591876Z z 4294967295(0xffffffff) 2024-12-18T03:40:44.7592543Z FBarrier Max Size: 32 2024-12-18T03:40:44.7593161Z ******* 2024-12-18T03:40:44.7593605Z Agent 4 2024-12-18T03:40:44.7594046Z ******* 2024-12-18T03:40:44.7594542Z Name: gfx90a 2024-12-18T03:40:44.7595203Z Uuid: GPU-85feac2c39886449 2024-12-18T03:40:44.7595904Z Marketing Name: AMD Instinct MI210 2024-12-18T03:40:44.7596622Z Vendor Name: AMD 2024-12-18T03:40:44.7597306Z Feature: KERNEL_DISPATCH 2024-12-18T03:40:44.7597991Z Profile: BASE_PROFILE 2024-12-18T03:40:44.7598684Z Float Round Mode: NEAR 2024-12-18T03:40:44.7599384Z Max Queue Number: 128(0x80) 2024-12-18T03:40:44.7600083Z Queue Min Size: 64(0x40) 2024-12-18T03:40:44.7600755Z Queue Max Size: 131072(0x20000) 2024-12-18T03:40:44.7601427Z Queue Type: MULTI 2024-12-18T03:40:44.7602059Z Node: 3 2024-12-18T03:40:44.7602698Z Device Type: GPU 2024-12-18T03:40:44.7603299Z Cache Info: 2024-12-18T03:40:44.7603802Z L1: 16(0x10) KB 2024-12-18T03:40:44.7604395Z L2: 8192(0x2000) KB 2024-12-18T03:40:44.7605008Z Chip ID: 29711(0x740f) 2024-12-18T03:40:44.7605673Z ASIC Revision: 1(0x1) 2024-12-18T03:40:44.7606378Z Cacheline Size: 64(0x40) 2024-12-18T03:40:44.7607077Z Max Clock Freq. (MHz): 1700 2024-12-18T03:40:44.7607744Z BDFID: 33536 2024-12-18T03:40:44.7608401Z Internal Node ID: 3 2024-12-18T03:40:44.7609088Z Compute Unit: 104 2024-12-18T03:40:44.7609764Z SIMDs per CU: 4 2024-12-18T03:40:44.7610437Z Shader Engines: 8 2024-12-18T03:40:44.7611145Z Shader Arrs. per Eng.: 1 2024-12-18T03:40:44.7611872Z WatchPts on Addr. 
Ranges:4 2024-12-18T03:40:44.7612713Z Coherent Host Access: FALSE 2024-12-18T03:40:44.7613488Z Memory Properties: 2024-12-18T03:40:44.7614098Z Features: KERNEL_DISPATCH 2024-12-18T03:40:44.7614988Z Fast F16 Operation: TRUE 2024-12-18T03:40:44.7615716Z Wavefront Size: 64(0x40) 2024-12-18T03:40:44.7616743Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:40:44.7617389Z Workgroup Max Size per Dimension: 2024-12-18T03:40:44.7617937Z x 1024(0x400) 2024-12-18T03:40:44.7618774Z y 1024(0x400) 2024-12-18T03:40:44.7619347Z z 1024(0x400) 2024-12-18T03:40:44.7619971Z Max Waves Per CU: 32(0x20) 2024-12-18T03:40:44.7620675Z Max Work-item Per CU: 2048(0x800) 2024-12-18T03:40:44.7621379Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:40:44.7622003Z Grid Max Size per Dimension: 2024-12-18T03:40:44.7622611Z x 4294967295(0xffffffff) 2024-12-18T03:40:44.7623303Z y 4294967295(0xffffffff) 2024-12-18T03:40:44.7623999Z z 4294967295(0xffffffff) 2024-12-18T03:40:44.7624786Z Max fbarriers/Workgrp: 32 2024-12-18T03:40:44.7625682Z Packet Processor uCode:: 83 2024-12-18T03:40:44.7626557Z SDMA engine uCode:: 8 2024-12-18T03:40:44.7627414Z IOMMU Support:: None 2024-12-18T03:40:44.7628137Z Pool Info: 2024-12-18T03:40:44.7628636Z Pool 1 2024-12-18T03:40:44.7629217Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2024-12-18T03:40:44.7629910Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:40:44.7630597Z Allocatable: TRUE 2024-12-18T03:40:44.7631307Z Alloc Granule: 4KB 2024-12-18T03:40:44.7632040Z Alloc Recommended Granule:2048KB 2024-12-18T03:40:44.7632801Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7633534Z Accessible by all: FALSE 2024-12-18T03:40:44.7634287Z Pool 2 2024-12-18T03:40:44.7634961Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2024-12-18T03:40:44.7635780Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:40:44.7636549Z Allocatable: TRUE 2024-12-18T03:40:44.7637246Z Alloc Granule: 4KB 2024-12-18T03:40:44.7637975Z Alloc Recommended Granule:2048KB 2024-12-18T03:40:44.7638709Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7639430Z Accessible by all: FALSE 2024-12-18T03:40:44.7640047Z Pool 3 2024-12-18T03:40:44.7640622Z Segment: GLOBAL; FLAGS: FINE GRAINED 2024-12-18T03:40:44.7641287Z Size: 67092480(0x3ffc000) KB 2024-12-18T03:40:44.7641959Z Allocatable: TRUE 2024-12-18T03:40:44.7642663Z Alloc Granule: 4KB 2024-12-18T03:40:44.7643395Z Alloc Recommended Granule:2048KB 2024-12-18T03:40:44.7644131Z Alloc Alignment: 4KB 2024-12-18T03:40:44.7644853Z Accessible by all: FALSE 2024-12-18T03:40:44.7645469Z Pool 4 2024-12-18T03:40:44.7646011Z Segment: GROUP 2024-12-18T03:40:44.7646647Z Size: 64(0x40) KB 2024-12-18T03:40:44.7647312Z Allocatable: FALSE 2024-12-18T03:40:44.7648268Z Alloc Granule: 0KB 2024-12-18T03:40:44.7648999Z Alloc Recommended Granule:0KB 2024-12-18T03:40:44.7649738Z Alloc Alignment: 0KB 2024-12-18T03:40:44.7650455Z Accessible by all: FALSE 2024-12-18T03:40:44.7651343Z ISA Info: 2024-12-18T03:40:44.7651803Z ISA 1 2024-12-18T03:40:44.7652443Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2024-12-18T03:40:44.7653344Z Machine Models: HSA_MACHINE_MODEL_LARGE 2024-12-18T03:40:44.7654218Z Profiles: HSA_PROFILE_BASE 2024-12-18T03:40:44.7655192Z Default Rounding Mode: NEAR 2024-12-18T03:40:44.7656077Z Default Rounding Mode: NEAR 2024-12-18T03:40:44.7656903Z Fast f16: TRUE 2024-12-18T03:40:44.7657744Z Workgroup Max Size: 1024(0x400) 2024-12-18T03:40:44.7658497Z Workgroup Max Size per Dimension: 2024-12-18T03:40:44.7659084Z x 1024(0x400) 2024-12-18T03:40:44.7659691Z y 1024(0x400) 2024-12-18T03:40:44.7660274Z z 1024(0x400) 
2024-12-18T03:40:44.7660916Z Grid Max Size: 4294967295(0xffffffff) 2024-12-18T03:40:44.7661551Z Grid Max Size per Dimension: 2024-12-18T03:40:44.7662100Z x 4294967295(0xffffffff) 2024-12-18T03:40:44.7662701Z y 4294967295(0xffffffff) 2024-12-18T03:40:44.7663298Z z 4294967295(0xffffffff) 2024-12-18T03:40:44.7663962Z FBarrier Max Size: 32 2024-12-18T03:40:44.7664594Z *** Done *** 2024-12-18T03:40:44.7665054Z + rocminfo 2024-12-18T03:40:44.7665480Z + grep -E 'Name:.*\sgfx|Marketing' 2024-12-18T03:40:44.8257301Z Marketing Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:40:44.8258093Z Marketing Name: AMD EPYC 7513 32-Core Processor 2024-12-18T03:40:44.8258829Z Name: gfx90a 2024-12-18T03:40:44.8259514Z Marketing Name: AMD Instinct MI210 2024-12-18T03:40:44.8260208Z Name: gfx90a 2024-12-18T03:40:44.8260888Z Marketing Name: AMD Instinct MI210 2024-12-18T03:40:44.8430883Z + [[ linux-focal-rocm6.2-py3.10 == *xpu* ]] 2024-12-18T03:40:44.8431542Z + [[ linux-focal-rocm6.2-py3.10 != *-bazel-* ]] 2024-12-18T03:40:44.8432196Z + pip_install --user ninja==1.10.2 2024-12-18T03:40:44.8432892Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2024-12-18T03:40:44.8433766Z + python3 -m pip install --progress-bar off --user ninja==1.10.2 2024-12-18T03:40:45.3964971Z Collecting ninja==1.10.2 2024-12-18T03:40:45.4334811Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2024-12-18T03:40:45.4456594Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2024-12-18T03:40:45.7249905Z Installing collected packages: ninja 2024-12-18T03:40:45.7318762Z WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH. 2024-12-18T03:40:45.7320514Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
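The `pip_install` seen in the trace is a thin CI helper over pip; the expanded `pip_install_pkg` string shows exactly what it runs. A minimal sketch, assuming the helper does nothing beyond disabling the progress bar, together with the PATH fix-up the warning asks for (which the script performs right after):

```bash
# Minimal sketch of the pip_install helper expanded in the trace above;
# assumes its only job is to run pip without a progress bar.
pip_install() {
  python3 -m pip install --progress-bar off "$@"
}

pip_install --user ninja==1.10.2

# --user installs put console scripts into the pip user base's bin/
# (here /var/lib/jenkins/.local/bin), which triggers the PATH warning;
# prepending that directory makes ninja and tlparse resolvable.
export PATH="$(python -m site --user-base)/bin:$PATH"
```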
2024-12-18T03:40:45.7345476Z Successfully installed ninja-1.10.2 2024-12-18T03:40:45.7961858Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-12-18T03:40:45.7965827Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-12-18T03:40:45.7968115Z + [[ linux-focal-rocm6.2-py3.10 == *aarch64* ]] 2024-12-18T03:40:45.7968754Z + install_tlparse 2024-12-18T03:40:45.7969246Z + pip_install --user tlparse==0.3.25 2024-12-18T03:40:45.7969977Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2024-12-18T03:40:45.7970851Z + python3 -m pip install --progress-bar off --user tlparse==0.3.25 2024-12-18T03:40:46.1370527Z Collecting tlparse==0.3.25 2024-12-18T03:40:46.1594990Z Downloading tlparse-0.3.25-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.7 kB) 2024-12-18T03:40:46.1685950Z Downloading tlparse-0.3.25-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.2 MB) 2024-12-18T03:40:46.4959851Z Installing collected packages: tlparse 2024-12-18T03:40:46.5255475Z Successfully installed tlparse-0.3.25 2024-12-18T03:40:46.5922518Z ++ python -m site --user-base 2024-12-18T03:40:46.6150405Z + PATH=/var/lib/jenkins/.local/bin:/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-12-18T03:40:46.6152664Z + [[ linux-focal-rocm6.2-py3.10 == *asan* ]] 2024-12-18T03:40:46.6153325Z + [[ linux-focal-rocm6.2-py3.10 == *-debug* ]] 2024-12-18T03:40:46.6153993Z + [[ linux-focal-rocm6.2-py3.10 != *-bazel-* ]] 2024-12-18T03:40:46.6154920Z + echo 'We are not in debug mode: linux-focal-rocm6.2-py3.10. Expect the assertion to pass' 2024-12-18T03:40:46.6156055Z We are not in debug mode: linux-focal-rocm6.2-py3.10. 
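The message above announces a build-sanity probe: `torch._C._crash_if_debug_asserts_fail` is an internal hook that should only trip when PyTorch was compiled with debug assertions, so on this release build the step is expected to exit cleanly. A sketch of the probe as run below — the argument 424242 appears to be an arbitrary sentinel, and the helper's semantics are inferred from its name and the log's own message:

```bash
# Debug-assert probe, as run in the trace below: on a release build the
# internal call should be a no-op (exit 0); with debug asserts compiled
# in, the process would abort and the CI step would fail.
python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)'

# A related check (assumption: torch.version.debug reflects whether the
# installed wheel was built in debug mode):
python -c 'import torch; print("debug build:", torch.version.debug)'
```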
Expect the assertion to pass 2024-12-18T03:40:46.6156991Z + cd test 2024-12-18T03:40:46.6157758Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2024-12-18T03:40:51.0278681Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2024-12-18T03:40:51.0279372Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2024-12-18T03:40:51.0282971Z + DYNAMO_BENCHMARK_FLAGS=() 2024-12-18T03:40:51.0283552Z + [[ distributed == *pr_time_benchmarks* ]] 2024-12-18T03:40:51.0284181Z + [[ distributed == *dynamo_eager* ]] 2024-12-18T03:40:51.0284809Z + [[ distributed == *aot_eager* ]] 2024-12-18T03:40:51.0285375Z + [[ distributed == *aot_inductor* ]] 2024-12-18T03:40:51.0285934Z + [[ distributed == *inductor* ]] 2024-12-18T03:40:51.0286472Z + [[ distributed == *dynamic* ]] 2024-12-18T03:40:51.0286989Z + [[ distributed == *cpu* ]] 2024-12-18T03:40:51.0287543Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2024-12-18T03:40:51.0313562Z + [[ linux-focal-rocm6.2-py3.10 == *libtorch* ]] 2024-12-18T03:40:51.0314027Z + [[ linux-focal-rocm6.2-py3.10 == *-bazel-* ]] 2024-12-18T03:40:51.0318485Z + cd test 2024-12-18T03:40:51.0318849Z + python -c 'import torch; print(torch.__config__.show())' 2024-12-18T03:40:53.2434871Z PyTorch built with: 2024-12-18T03:40:53.2435455Z - GCC 9.4 2024-12-18T03:40:53.2435916Z - C++ Version: 201703 2024-12-18T03:40:53.2436961Z - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications 2024-12-18T03:40:53.2438307Z - Intel(R) MKL-DNN v3.5.3 (Git Hash 66f0cb9eb66affd2da3bf5f8d897376f04aae6af) 2024-12-18T03:40:53.2439110Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2024-12-18T03:40:53.2439749Z - LAPACK is enabled (usually provided by MKL) 2024-12-18T03:40:53.2440354Z - NNPACK is enabled 2024-12-18T03:40:53.2440839Z - CPU capability usage: AVX2 2024-12-18T03:40:53.2441359Z - HIP Runtime 6.2.41134 2024-12-18T03:40:53.2441847Z - MIOpen 3.2.0 2024-12-18T03:40:53.2442267Z - Magma 2.7.2 2024-12-18T03:40:53.2451228Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=0cdf8b1d09254cfda66191d1bd01e3041c3c76f7, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.6.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, 2024-12-18T03:40:53.2460309Z 2024-12-18T03:40:53.7591288Z + cd test 2024-12-18T03:40:53.7591696Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2024-12-18T03:40:54.9098192Z ATen/Parallel: 2024-12-18T03:40:54.9098807Z at::get_num_threads() : 64 2024-12-18T03:40:54.9099474Z at::get_num_interop_threads() : 64 
2024-12-18T03:40:54.9100081Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2024-12-18T03:40:54.9100640Z omp_get_max_threads() : 64 2024-12-18T03:40:54.9101643Z Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications 2024-12-18T03:40:54.9102711Z mkl_get_max_threads() : 64 2024-12-18T03:40:54.9103425Z Intel(R) MKL-DNN v3.5.3 (Git Hash 66f0cb9eb66affd2da3bf5f8d897376f04aae6af) 2024-12-18T03:40:54.9104269Z std::thread::hardware_concurrency() : 64 2024-12-18T03:40:54.9104851Z Environment variables: 2024-12-18T03:40:54.9105330Z OMP_NUM_THREADS : [not set] 2024-12-18T03:40:54.9105852Z MKL_NUM_THREADS : [not set] 2024-12-18T03:40:54.9106370Z ATen parallel backend: OpenMP 2024-12-18T03:40:54.9106732Z 2024-12-18T03:40:56.4742917Z + [[ distributed == *numpy_2* ]] 2024-12-18T03:40:56.4743547Z + [[ linux-focal-rocm6.2-py3.10 == *aarch64* ]] 2024-12-18T03:40:56.4744108Z + [[ distributed == *backward* ]] 2024-12-18T03:40:56.4744650Z + [[ distributed == *xla* ]] 2024-12-18T03:40:56.4745117Z + [[ distributed == *executorch* ]] 2024-12-18T03:40:56.4745591Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2024-12-18T03:40:56.4746131Z + [[ linux-focal-rocm6.2-py3.10 == *libtorch* ]] 2024-12-18T03:40:56.4746659Z + [[ distributed == distributed ]] 2024-12-18T03:40:56.4747099Z + test_distributed 2024-12-18T03:40:56.4747513Z + echo 'Testing distributed python tests' 2024-12-18T03:40:56.4748037Z Testing distributed python tests 2024-12-18T03:40:56.4748672Z + python test/run_test.py --distributed-tests --shard 2 3 --verbose 2024-12-18T03:40:56.5700488Z /var/lib/jenkins/pytorch/test/run_test.py:22: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-12-18T03:40:56.5702106Z import pkg_resources 2024-12-18T03:41:05.3325619Z Excluding distributed/rpc/test_faulty_agent on ROCm 2024-12-18T03:41:05.3326557Z Excluding distributed/rpc/test_tensorpipe_agent on ROCm 2024-12-18T03:41:05.3327458Z Excluding distributed/rpc/test_share_memory on ROCm 2024-12-18T03:41:05.3328273Z Excluding distributed/rpc/cuda/test_tensorpipe_agent on ROCm 2024-12-18T03:41:05.3329218Z Excluding distributed/_shard/sharded_tensor/ops/test_embedding on ROCm 2024-12-18T03:41:05.3330232Z Excluding distributed/_shard/sharded_tensor/ops/test_embedding_bag on ROCm 2024-12-18T03:41:05.3331270Z Excluding distributed/_shard/sharded_tensor/ops/test_binary_cmp on ROCm 2024-12-18T03:41:05.3332241Z Excluding distributed/_shard/sharded_tensor/ops/test_init on ROCm 2024-12-18T03:41:05.3333184Z Excluding distributed/_shard/sharded_optim/test_sharded_optim on ROCm 2024-12-18T03:41:05.3335054Z Excluding distributed/_tensor/test_attention on ROCm 2024-12-18T03:41:06.3022713Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json?versionId=PhiMB7EP3187qvpKvnORewoK3InOIvX5 to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2024-12-18T03:41:06.9315220Z Ignoring disabled issues: [''] 2024-12-18T03:41:06.9613988Z Found test times from artifacts 2024-12-18T03:41:07.0196068Z Found test times from artifacts 2024-12-18T03:41:07.0211745Z Running all tests 2024-12-18T03:41:07.0294743Z Running parallel tests on 2 processes 2024-12-18T03:41:07.0297763Z Name: tests to run (est. 
time: 142.37min) 2024-12-18T03:41:07.0298355Z Serial tests (71): 2024-12-18T03:41:07.0298933Z distributed/_tools/test_runtime_estimator 1/1 2024-12-18T03:41:07.0299623Z distributed/test_device_mesh 1/1 2024-12-18T03:41:07.0300261Z distributed/_tensor/test_random_ops 1/1 2024-12-18T03:41:07.0300873Z distributed/launcher/test_run 1/1 2024-12-18T03:41:07.0301512Z distributed/_tensor/test_pointwise_ops 1/1 2024-12-18T03:41:07.0302236Z distributed/_composable/fsdp/test_fully_shard_init 1/1 2024-12-18T03:41:07.0302944Z distributed/_tensor/test_embedding_ops 1/1 2024-12-18T03:41:07.0303557Z distributed/_tensor/test_matrix_ops 1/1 2024-12-18T03:41:07.0304306Z distributed/_composable/fsdp/test_fully_shard_autograd 1/1 2024-12-18T03:41:07.0305043Z distributed/fsdp/test_fsdp_misc 1/1 2024-12-18T03:41:07.0305736Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 2024-12-18T03:41:07.0306470Z distributed/test_compute_comm_reordering 1/1 2024-12-18T03:41:07.0307109Z distributed/_tensor/test_dtensor 1/1 2024-12-18T03:41:07.0307715Z distributed/optim/test_named_optimizer 1/1 2024-12-18T03:41:07.0308316Z distributed/test_fake_pg 1/1 2024-12-18T03:41:07.0308944Z distributed/checkpoint/test_fsdp_optim_state 1/1 2024-12-18T03:41:07.0309682Z distributed/_tensor/experimental/test_local_map 1/1 2024-12-18T03:41:07.0310404Z distributed/_composable/test_checkpoint 1/1 2024-12-18T03:41:07.0311102Z distributed/checkpoint/test_dtensor_resharding 1/1 2024-12-18T03:41:07.0311778Z distributed/test_distributed_spawn 1/12 2024-12-18T03:41:07.0312395Z distributed/test_distributed_spawn 4/12 2024-12-18T03:41:07.0312991Z distributed/test_distributed_spawn 7/12 2024-12-18T03:41:07.0313610Z distributed/test_distributed_spawn 10/12 2024-12-18T03:41:07.0314232Z distributed/_tensor/test_redistribute 1/1 2024-12-18T03:41:07.0314882Z distributed/_tools/test_fsdp2_mem_tracker 1/1 2024-12-18T03:41:07.0315599Z distributed/checkpoint/e2e/test_e2e_save_and_load 1/1 2024-12-18T03:41:07.0316305Z distributed/checkpoint/test_format_utils 1/1 2024-12-18T03:41:07.0316982Z distributed/checkpoint/e2e/test_fine_tuning 1/1 2024-12-18T03:41:07.0317723Z distributed/_tensor/experimental/test_tp_transform 1/1 2024-12-18T03:41:07.0318454Z distributed/checkpoint/test_traverse 1/1 2024-12-18T03:41:07.0319142Z distributed/tensor/parallel/test_tp_random_state 1/1 2024-12-18T03:41:07.0319839Z distributed/elastic/test_control_plane 1/1 2024-12-18T03:41:07.0320558Z distributed/_composable/test_replicate_with_compiler 1/1 2024-12-18T03:41:07.0321250Z distributed/test_nccl 1/1 2024-12-18T03:41:07.0321787Z distributed/test_functional_api 1/1 2024-12-18T03:41:07.0322482Z distributed/optim/test_apply_optimizer_in_backward 1/1 2024-12-18T03:41:07.0323185Z distributed/fsdp/test_fsdp_state_dict 2/3 2024-12-18T03:41:07.0323934Z distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 2024-12-18T03:41:07.0324682Z distributed/checkpoint/test_utils 1/1 2024-12-18T03:41:07.0325277Z distributed/_tensor/test_utils 1/1 2024-12-18T03:41:07.0325853Z distributed/test_c10d_nccl 2/3 2024-12-18T03:41:07.0326437Z distributed/fsdp/test_fsdp_optim_state 1/2 2024-12-18T03:41:07.0327080Z distributed/checkpoint/test_checkpoint 1/1 2024-12-18T03:41:07.0327730Z distributed/test_c10d_object_collectives 1/1 2024-12-18T03:41:07.0328852Z distributed/test_c10d_pypg 1/1 2024-12-18T03:41:07.0329496Z distributed/tensor/parallel/test_parallelize_api 1/1 2024-12-18T03:41:07.0330191Z distributed/fsdp/test_fsdp_traversal 1/1 2024-12-18T03:41:07.0330827Z 
distributed/checkpoint/test_state_dict 1/1 2024-12-18T03:41:07.0331842Z distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 1/1 2024-12-18T03:41:07.0332587Z distributed/fsdp/test_fsdp_exec_order 1/1 2024-12-18T03:41:07.0333295Z distributed/_composable/fsdp/test_fully_shard_memory 1/1 2024-12-18T03:41:07.0334030Z distributed/fsdp/test_checkpoint_wrapper 1/1 2024-12-18T03:41:07.0334803Z distributed/fsdp/test_utils 1/1 2024-12-18T03:41:07.0335440Z distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 2024-12-18T03:41:07.0336124Z distributed/fsdp/test_fsdp_tp_integration 1/1 2024-12-18T03:41:07.0336775Z distributed/fsdp/test_fsdp_checkpoint 1/1 2024-12-18T03:41:07.0337500Z distributed/_composable/fsdp/test_fully_shard_training 1/1 2024-12-18T03:41:07.0338218Z distributed/fsdp/test_fsdp_core 3/3 2024-12-18T03:41:07.0338966Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 2024-12-18T03:41:07.0339726Z distributed/test_launcher 1/1 2024-12-18T03:41:07.0340387Z distributed/_shard/sharded_tensor/test_sharded_tensor 1/1 2024-12-18T03:41:07.0341134Z distributed/fsdp/test_fsdp_mixed_precision 2/2 2024-12-18T03:41:07.0341780Z distributed/test_c10d_spawn_gloo 1/1 2024-12-18T03:41:07.0342351Z distributed/test_c10d_spawn_ucc 1/1 2024-12-18T03:41:07.0342912Z distributed/test_c10d_spawn_nccl 1/1 2024-12-18T03:41:07.0343504Z distributed/elastic/events/lib_test 1/1 2024-12-18T03:41:07.0344116Z distributed/elastic/metrics/api_test 1/1 2024-12-18T03:41:07.0344780Z distributed/elastic/multiprocessing/api_test 1/1 2024-12-18T03:41:07.0345522Z distributed/elastic/timer/local_timer_example 1/1 2024-12-18T03:41:07.0346195Z distributed/elastic/utils/logging_test 1/1 2024-12-18T03:41:07.0346822Z distributed/elastic/utils/util_test 1/1 2024-12-18T03:41:07.0347402Z Parallel tests (0): 2024-12-18T03:41:07.0347876Z Name: excluded (est. time: 0.0min) 2024-12-18T03:41:07.0348400Z Serial tests (0): 2024-12-18T03:41:07.0348835Z Parallel tests (0): 2024-12-18T03:41:07.0394400Z Running distributed/_tools/test_runtime_estimator 1/1 ... [2024-12-18 03:41:07.039056] 2024-12-18T03:41:07.0394981Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:41:07.0397641Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_runtime_estimator.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:41:07.039332] 2024-12-18T03:41:55.5079091Z 2024-12-18T03:41:55.5080738Z distributed/_tools/test_runtime_estimator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_runtime_estimator_1.1_c7129f7d8fddd734_.log 2024-12-18T03:41:55.5083887Z Running 2 items in this shard: test/distributed/_tools/test_runtime_estimator.py::TestRuntimeEstimator::test_conv_model_runtime, test/distributed/_tools/test_runtime_estimator.py::TestRuntimeEstimator::test_transformer_runtime 2024-12-18T03:41:55.5085960Z 2024-12-18T03:41:55.5090409Z Running distributed/test_device_mesh 1/1 ... [2024-12-18 03:41:55.508524] 2024-12-18T03:41:55.5091461Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:41:55.5096778Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_device_mesh.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 03:41:55.509099] 2024-12-18T03:46:03.1651945Z 2024-12-18T03:46:03.1653484Z distributed/test_device_mesh 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_device_mesh_1.1_a01acd756aee3f38_.log 2024-12-18T03:46:03.1685335Z Running 47 items in this shard: test/distributed/test_device_mesh.py::DeviceMeshTestGlooBackend::test_device_mesh_reuse_default_group, test/distributed/test_device_mesh.py::DeviceMeshTest::test_2d_mesh_eager_init_subgroup, test/distributed/test_device_mesh.py::DeviceMeshTest::test_2d_mesh_non_eager_init_subgroup, test/distributed/test_device_mesh.py::DeviceMeshTest::test_assert_invalid_mesh_tensor, test/distributed/test_device_mesh.py::DeviceMeshTest::test_device_mesh_2d, test/distributed/test_device_mesh.py::DeviceMeshTest::test_device_mesh_init_backend, test/distributed/test_device_mesh.py::DeviceMeshTest::test_fake_pg_device_mesh, test/distributed/test_device_mesh.py::DeviceMeshTest::test_from_group_with_global_pg, test/distributed/test_device_mesh.py::DeviceMeshTest::test_from_group_with_invalid_mesh, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_group_and_get_all_groups, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_local_rank, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_local_rank_raises_exception, test/distributed/test_device_mesh.py::DeviceMeshTest::test_init_process_group, test/distributed/test_device_mesh.py::DeviceMeshTest::test_raises_invalid_device_type, test/distributed/test_device_mesh.py::DeviceMeshTest::test_set_mesh_dim_group_options, test/distributed/test_device_mesh.py::DeviceMeshTestNDim::test_device_mesh_hash, test/distributed/test_device_mesh.py::DeviceMeshTestNDim::test_device_mesh_nd, test/distributed/test_device_mesh.py::DeviceMeshTestNDim::test_device_mesh_parent_child_hash, test/distributed/test_device_mesh.py::DeviceMeshTestNDim::test_from_group_with_mesh_shape, test/distributed/test_device_mesh.py::DeviceMeshTestNDim::test_get_local_rank_3d, test/distributed/test_device_mesh.py::InitDeviceMeshTest::test_init_device_mesh, test/distributed/test_device_mesh.py::InitDeviceMeshTest::test_raises_duplicate_mesh_dim_names, test/distributed/test_device_mesh.py::InitDeviceMeshTest::test_raises_mesh_shape_mesh_dim_names_mismatch, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_cache_and_reuse_submesh_slice_result, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_flatten_mesh_3d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_flatten_mesh_4d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_get_item_1d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_get_item_2d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_get_item_3d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_get_item_3d_noncontiguous_slicing, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_raises_invalid_mesh_dim_name, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_raises_no_mesh_dim_found, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_reconstruct_mesh_with_flatten_dim, test/distributed/test_device_mesh.py::TestMeshEnv::test_get_all_submeshes, test/distributed/test_device_mesh.py::TestMeshEnv::test_get_mesh_dim_by_name, test/distributed/test_device_mesh.py::TestMeshEnv::test_get_root_mesh, test/distributed/test_device_mesh.py::TestMeshEnv::test_get_root_mesh_dim_exist, 
test/distributed/test_device_mesh.py::TestMeshEnv::test_get_root_mesh_dim_not_exist, test/distributed/test_device_mesh.py::TestMeshEnv::test_mesh_slice_fake_tensor_mode, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_all_gather_uneven, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_broadcast_1d, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_broadcast_nd, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_reduce_scatter_contiguous, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_reduce_scatter_uneven, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_scatter_1d, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_scatter_nd, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_scatter_uneven 2024-12-18T03:46:03.1707996Z 2024-12-18T03:46:03.1708234Z Running distributed/_tensor/test_random_ops 1/1 ... [2024-12-18 03:46:03.165655] 2024-12-18T03:46:03.1708674Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:46:03.1709804Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_random_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:46:03.166284] 2024-12-18T03:47:15.1318357Z 2024-12-18T03:47:15.1320795Z distributed/_tensor/test_random_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_random_ops_1.1_e5409f75bd4427be_.log 2024-12-18T03:47:15.1331471Z Running 11 items in this shard: test/distributed/_tensor/test_random_ops.py::DistTensorRandomInitTest::test_init_ops, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_deterministic_dropout_1d, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_deterministic_rand_1d, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_deterministic_uniform_2d, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_fsdp_tp_model_meta_init, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_manual_seed, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_manual_seed_submesh, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_meta_tensor_init, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_pipeline_parallel_manual_seed, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_rng_tracker_init, test/distributed/_tensor/test_random_ops.py::DistTensorRandomOpTest::test_tp_model_meta_init 2024-12-18T03:47:15.1338159Z 2024-12-18T03:47:15.1338403Z Running distributed/launcher/test_run 1/1 ... [2024-12-18 03:47:15.132047] 2024-12-18T03:47:15.1338845Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:47:15.1339816Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/launcher/test_run.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 03:47:15.132611] 2024-12-18T03:48:03.7399108Z 2024-12-18T03:48:03.7400756Z distributed/launcher/test_run 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.launcher.test_run_1.1_a375c29c158316fc_.log 2024-12-18T03:48:03.7418718Z Running 24 items in this shard: test/distributed/launcher/test_run.py::ElasticLaunchTest::test_capture_logs_using_default_logs_specs, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_init_method_env_with_torchelastic, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_init_method_tcp_with_torchelastic, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_is_not_torchelastic_launched, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_is_torchelastic_launched, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_is_torchelastic_launched_with_logs_spec_defined, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic_agent_raise_exception, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic_multiple_agents, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_elastic_worker_raise_exception, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_run_path, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_shutdown, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_standalone, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_bash, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_default_nproc, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_python, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_user_script_python_caffe2_bc, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_launch_with_env_vars, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_logs_logs_spec_entrypoint_must_be_defined, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_min_max_nodes_parse, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_gpu_launch_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_launch_auto_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_launch_number_configurations, test/distributed/launcher/test_run.py::ElasticLaunchTest::test_nproc_launch_unknown_configurations 2024-12-18T03:48:03.7436867Z 2024-12-18T03:48:03.7437454Z Running distributed/_tensor/test_pointwise_ops 1/1 ... [2024-12-18 03:48:03.740638] 2024-12-18T03:48:03.7438444Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:48:03.7440385Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_pointwise_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 03:48:03.741199] 2024-12-18T03:48:09.6192276Z 2024-12-18T03:48:09.6193997Z distributed/_tensor/test_pointwise_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_pointwise_ops_1.1_f228b6ce7f2c9700_.log 2024-12-18T03:48:09.6200014Z Running 7 items in this shard: test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_activations, test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout, test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward, test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_errors, test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_mul_out, test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_partial_add, test/distributed/_tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_partial_mul 2024-12-18T03:48:09.6204766Z 2024-12-18T03:48:09.6205357Z Running distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2024-12-18 03:48:09.619524] 2024-12-18T03:48:09.6206435Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:48:09.6208823Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:48:09.620070] 2024-12-18T03:48:31.9312952Z 2024-12-18T03:48:31.9314882Z distributed/_composable/fsdp/test_fully_shard_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_init_1.1_1690669df6169792_.log 2024-12-18T03:48:31.9341777Z Running 38 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_tensor, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_invalid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_valid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_2d_mesh_without_mesh_dim_names, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_invalid_mesh_ndim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_duplicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested_fully_shard_and_replicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_single, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_nested_fully_shard, 
test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_shared_params_and_buffers, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_duplicates, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_shared_params, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_noncontiguous_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_scalar_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_shard_tensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterDTensor::test_shard_dtensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_double_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_is_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_module_and_param_fqns, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_multi_module_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_reset_sharded_param_in_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_invalid_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_1d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_2d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_rank0_broadcast_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_1d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_2d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardHSDPBroadcast::test_hsdp_broadcast_across_replicas, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_dim_neg1, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_uneven_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_2d_transformer_shard_diff_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_invalid_shard_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardOldImport::test_old_import_training 2024-12-18T03:48:31.9359693Z 2024-12-18T03:48:31.9359949Z Running distributed/_tensor/test_embedding_ops 1/1 ... 
[2024-12-18 03:48:31.931270] 2024-12-18T03:48:31.9360544Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:48:31.9361536Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_embedding_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:48:31.931848] 2024-12-18T03:49:02.0607688Z 2024-12-18T03:49:02.0609502Z distributed/_tensor/test_embedding_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_embedding_ops_1.1_3ce11fd6eab727bf_.log 2024-12-18T03:49:02.0615457Z Running 4 items in this shard: test/distributed/_tensor/test_embedding_ops.py::TestEmbeddingOp::test_multiple_embeddings_rowwise, test/distributed/_tensor/test_embedding_ops.py::TestEmbeddingOp::test_sharded_embedding_colwise, test/distributed/_tensor/test_embedding_ops.py::TestEmbeddingOp::test_sharded_embedding_colwise_max_norm_errors, test/distributed/_tensor/test_embedding_ops.py::TestEmbeddingOp::test_sharded_embedding_rowwise 2024-12-18T03:49:02.0618670Z 2024-12-18T03:49:02.0619147Z Running distributed/_tensor/test_matrix_ops 1/1 ... [2024-12-18 03:49:02.060616] 2024-12-18T03:49:02.0620018Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:49:02.0622033Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_matrix_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:49:02.061175] 2024-12-18T03:50:43.3920190Z 2024-12-18T03:50:43.3921778Z distributed/_tensor/test_matrix_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_matrix_ops_1.1_f58e5e1a758dfaa1_.log 2024-12-18T03:50:43.3929354Z Running 11 items in this shard: test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_auto_redistribute, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_empty_operand, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_baddbmm, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_bmm, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_dtensor_mm, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_matmul, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_mm, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_dot_product_attention, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t, test/distributed/_tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t_partial 2024-12-18T03:50:43.3936526Z 2024-12-18T03:50:43.3937254Z Running distributed/_composable/fsdp/test_fully_shard_autograd 1/1 ... [2024-12-18 03:50:43.392260] 2024-12-18T03:50:43.3938321Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:50:43.3939450Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_autograd.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 03:50:43.392824] 2024-12-18T03:51:42.8864270Z 2024-12-18T03:51:42.8866558Z distributed/_composable/fsdp/test_fully_shard_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_autograd_1.1_df1aa640a35a1366_.log 2024-12-18T03:51:42.8872961Z Running 5 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_autograd.py::TestFullyShardAutograd::test_nontensor_activations, test/distributed/_composable/fsdp/test_fully_shard_autograd.py::TestFullyShardAutograd::test_unused_forward_module, test/distributed/_composable/fsdp/test_fully_shard_autograd.py::TestFullyShardAutograd::test_unused_forward_output, test/distributed/_composable/fsdp/test_fully_shard_autograd.py::TestFullyShardPostAccGradHookMultiThread::test_post_acc_grad_hook_runs, test/distributed/_composable/fsdp/test_fully_shard_autograd.py::TestFullyShardPostAccGradHookMultiProcess::test_post_acc_grad_hook_optim_parity 2024-12-18T03:51:42.8878437Z 2024-12-18T03:51:42.8879235Z Running distributed/fsdp/test_fsdp_misc 1/1 ... [2024-12-18 03:51:42.886625] 2024-12-18T03:51:42.8880085Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:51:42.8881960Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_misc.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:51:42.886942] 2024-12-18T03:53:58.0903376Z 2024-12-18T03:53:58.0908920Z distributed/fsdp/test_fsdp_misc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_misc_1.1_2d8885a27cea9e21_.log 2024-12-18T03:53:58.0921349Z Running 28 items in this shard: test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_cpu_init_with_sync_module_states, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_cpu_init_stays_on_cpu, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_cpu_training, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_device_id_use_index_False, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_device_id_use_index_True, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_not_all_outputs_used_in_loss, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_optim_overlap_no_use_orig_params_error, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_optimizer_overlap, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiProcess::test_fsdp_zero2_eval_with_prefetch, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_cpu_gpu_module, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_device_id_auto_wrap, 
test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_device_id_cpu_offload, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_device_id_no_move_ignored_params_and_bufs, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_ignored_module_meta, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_namedtuple, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_same_model_across_ranks, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_fsdp_unsupported_module_cls, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_homogeneous_attributes, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_module_device_mismatches_device_id, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_multigpu_module, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscMultiThread::test_no_params, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscWorldSize1::test_training_device_mismatch_errors, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscWorldSize1::test_unsafe_setattr, test/distributed/fsdp/test_fsdp_misc.py::TestFSDPMiscWorldSize1::test_world_size_1_sharding_strategy_warning 2024-12-18T03:53:58.0933471Z 2024-12-18T03:53:58.0934058Z Running distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2024-12-18 03:53:58.090494] 2024-12-18T03:53:58.0935882Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:53:58.0938265Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_tensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:53:58.091068] 2024-12-18T03:54:28.5184193Z 2024-12-18T03:54:28.5189755Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_tensor_ops_1.1_a9eb37d9d6bf671f_.log 2024-12-18T03:54:28.5194786Z Running 5 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_clone, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_deep_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_detach, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_inplace_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_set_requires_grad 2024-12-18T03:54:28.5198337Z 2024-12-18T03:54:28.5198862Z Running distributed/test_compute_comm_reordering 1/1 ... [2024-12-18 03:54:28.518814] 2024-12-18T03:54:28.5199821Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:54:28.5202155Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_compute_comm_reordering.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 03:54:28.519380] 2024-12-18T03:55:52.0220617Z 2024-12-18T03:55:52.0229532Z distributed/test_compute_comm_reordering 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_compute_comm_reordering_1.1_6dbd4666a1c0907b_.log 2024-12-18T03:55:52.0237024Z Running 7 items in this shard: test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_grouped_scheduler_node, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_nccl_heuristics, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_raise_comms, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_reorder_compute_for_overlap, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_reorder_compute_for_overlap_custom_runtime_estimation, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_sink_waits, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_sink_waits_raise_comms 2024-12-18T03:55:52.0240168Z 2024-12-18T03:55:52.0240400Z Running distributed/_tensor/test_dtensor 1/1 ... [2024-12-18 03:55:52.022400] 2024-12-18T03:55:52.0240837Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:55:52.0241814Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_dtensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:55:52.023067] 2024-12-18T03:59:23.0397481Z 2024-12-18T03:59:23.0403216Z distributed/_tensor/test_dtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_dtensor_1.1_67eb5cb251718698_.log 2024-12-18T03:59:23.0425053Z Running 36 items in this shard: test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_async_output, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_constructor, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_new_empty_strided, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_properties, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load_import, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_spec_hash, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_spec_read_only_after_set, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_dtensor_stride, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_from_local, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_from_local_negative_dim, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_from_local_then_to_local, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding_raise_error, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_full_tensor_grad_hint, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_full_tensor_sync, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_meta_dtensor, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_modules_w_meta_dtensor, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_shard_tensor, 
test/distributed/_tensor/test_dtensor.py::DTensorTest::test_shard_tensor_2d, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_to_local, test/distributed/_tensor/test_dtensor.py::DTensorTest::test_to_local_grad_hint, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_auto_implicit_replication, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_default_value_sub_mesh, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_device_mesh_nd, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_2d_mesh, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_api_device_mesh_context_manager, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_device_mesh_device_conversion, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_spec_local_shard_offset, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_from_local_sub_mesh, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_implicit_replication, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_implicit_replication_for_foreach_ops, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_metadata_consistency_check, test/distributed/_tensor/test_dtensor.py::DTensorMeshTest::test_redistribute_sub_mesh, test/distributed/_tensor/test_dtensor.py::TestDTensorPlacementTypes::test_split_tensor_1D, test/distributed/_tensor/test_dtensor.py::DTensorLogTest::test_dtensor_log 2024-12-18T03:59:23.0436346Z 2024-12-18T03:59:23.0436599Z Running distributed/optim/test_named_optimizer 1/1 ... [2024-12-18 03:59:23.040352] 2024-12-18T03:59:23.0437067Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:59:23.0438079Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/optim/test_named_optimizer.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:59:23.040927] 2024-12-18T03:59:26.9402071Z 2024-12-18T03:59:26.9403894Z distributed/optim/test_named_optimizer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.optim.test_named_optimizer_1.1_9e5cf65566716eb7_.log 2024-12-18T03:59:26.9405139Z 2024-12-18T03:59:26.9406246Z Running distributed/test_fake_pg 1/1 ... [2024-12-18 03:59:26.940274] 2024-12-18T03:59:26.9407092Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:59:26.9413646Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_fake_pg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 03:59:26.940856] 2024-12-18T03:59:37.9309671Z 2024-12-18T03:59:37.9311578Z distributed/test_fake_pg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_fake_pg_1.1_632aae694b9b8873_.log 2024-12-18T03:59:37.9320207Z Running 13 items in this shard: test/distributed/test_fake_pg.py::TestFakePG::test_all_reduce, test/distributed/test_fake_pg.py::TestFakePG::test_allgather, test/distributed/test_fake_pg.py::TestFakePG::test_alltoall, test/distributed/test_fake_pg.py::TestFakePG::test_alltoall_base, test/distributed/test_fake_pg.py::TestFakePG::test_broadcast, test/distributed/test_fake_pg.py::TestFakePG::test_construct_fsdp, test/distributed/test_fake_pg.py::TestFakePG::test_fake_pg_tracing, test/distributed/test_fake_pg.py::TestFakePG::test_fsdp_fake_e2e, test/distributed/test_fake_pg.py::TestFakePG::test_fsdp_tp_fake_e2e, test/distributed/test_fake_pg.py::TestFakePG::test_recv, test/distributed/test_fake_pg.py::TestFakePG::test_reduce_scatter, test/distributed/test_fake_pg.py::TestFakePG::test_scatter, test/distributed/test_fake_pg.py::TestFakePG::test_send 2024-12-18T03:59:37.9325931Z 2024-12-18T03:59:37.9326455Z Running distributed/checkpoint/test_fsdp_optim_state 1/1 ... [2024-12-18 03:59:37.931365] 2024-12-18T03:59:37.9327329Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T03:59:37.9329226Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_fsdp_optim_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 03:59:37.932002] 2024-12-18T04:00:05.3569540Z 2024-12-18T04:00:05.3578604Z distributed/checkpoint/test_fsdp_optim_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_fsdp_optim_state_1.1_6f3142b7babf1095_.log 2024-12-18T04:00:05.3582260Z Running 2 items in this shard: test/distributed/checkpoint/test_fsdp_optim_state.py::FsdpOptimStateCheckpoint::test_load_sharded_optimizer_state_dict_pass_planner_False, test/distributed/checkpoint/test_fsdp_optim_state.py::FsdpOptimStateCheckpoint::test_load_sharded_optimizer_state_dict_pass_planner_True 2024-12-18T04:00:05.3584597Z 2024-12-18T04:00:05.3585152Z Running distributed/_tensor/experimental/test_local_map 1/1 ... [2024-12-18 04:00:05.357421] 2024-12-18T04:00:05.3586139Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:00:05.3588193Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/experimental/test_local_map.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:00:05.358022] 2024-12-18T04:00:55.5816363Z 2024-12-18T04:00:55.5818008Z distributed/_tensor/experimental/test_local_map 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.experimental.test_local_map_1.1_ab88fa772f74ddff_.log 2024-12-18T04:00:55.5822234Z Running 4 items in this shard: test/distributed/_tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_correctness, test/distributed/_tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_in_placements, test/distributed/_tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_out_placements, test/distributed/_tensor/experimental/test_local_map.py::TestLocalMap::test_local_map_redistribute 2024-12-18T04:00:55.5825449Z 2024-12-18T04:00:55.5826060Z Running distributed/_composable/test_checkpoint 1/1 ... [2024-12-18 04:00:55.582009] 2024-12-18T04:00:55.5827125Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:00:55.5831402Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:00:55.582624] 2024-12-18T04:01:06.1208049Z 2024-12-18T04:01:06.1209959Z distributed/_composable/test_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_checkpoint_1.1_0b3c7517cd3d7d9a_.log 2024-12-18T04:01:06.1216350Z Running 6 items in this shard: test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_checkpoint_kwargs, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_clears_state_on_error_in_forward, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_multi_args, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_random_cpu, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_tensor_only_cpu, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_tensor_only_gpu 2024-12-18T04:01:06.1220710Z 2024-12-18T04:01:06.1221375Z Running distributed/checkpoint/test_dtensor_resharding 1/1 ... [2024-12-18 04:01:06.121100] 2024-12-18T04:01:06.1222485Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:01:06.1224911Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_dtensor_resharding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:01:06.121693] 2024-12-18T04:01:45.6796074Z 2024-12-18T04:01:45.6801827Z distributed/checkpoint/test_dtensor_resharding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_dtensor_resharding_1.1_b75f364da09e3cbc_.log 2024-12-18T04:01:45.6808240Z Running 5 items in this shard: test/distributed/checkpoint/test_dtensor_resharding.py::TestDTensorReshardPlacementChange::test_1d_to_1d_reshard_placement_change, test/distributed/checkpoint/test_dtensor_resharding.py::TestDTensorReshardPlacementChange::test_2d_to_2d_reshard_placement_change, test/distributed/checkpoint/test_dtensor_resharding.py::TestDTensorReshardMeshChange::test_1d_to_2d_reshard_mesh_change, test/distributed/checkpoint/test_dtensor_resharding.py::TestDTensorReshardMeshChange::test_2d_to_1d_reshard_mesh_change, test/distributed/checkpoint/test_dtensor_resharding.py::TestDTensorReshardMeshChange::test_dtensor_checkpoint_resharding_with_empty_shard 2024-12-18T04:01:45.6813695Z 2024-12-18T04:01:45.6814249Z Running distributed/test_distributed_spawn 1/12 ... [2024-12-18 04:01:45.679938] 2024-12-18T04:01:45.6827464Z MPI not available -- MPI backend tests will be skipped 2024-12-18T04:01:45.6828287Z Running distributed tests for the test backend with env init_method 2024-12-18T04:01:45.6833225Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:01:45.6834352Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=1', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:01:45.683131] 2024-12-18T04:01:50.8782773Z 2024-12-18T04:01:50.8784532Z distributed/test_distributed_spawn 1/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_1.12_ab0087a033ea1f6f_.log 2024-12-18T04:01:50.8786190Z Running 0 items in this shard: 2024-12-18T04:01:50.8786552Z 2024-12-18T04:01:50.8794203Z Running distributed tests for the test backend with file init_method 2024-12-18T04:01:50.8796785Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:01:50.8800408Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=1', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:01:50.879607] 2024-12-18T04:01:56.0490061Z 2024-12-18T04:01:56.0491924Z distributed/test_distributed_spawn 1/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_1.12_7ceeddf0d64aafb9_.log 2024-12-18T04:01:56.0493542Z Running 0 items in this shard: 2024-12-18T04:01:56.0495060Z 2024-12-18T04:01:56.0504523Z Running distributed tests for the nccl backend with env init_method 2024-12-18T04:01:56.0504973Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:01:56.0510140Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=1', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:01:56.050434] 2024-12-18T04:08:11.3993286Z 2024-12-18T04:08:11.3995132Z distributed/test_distributed_spawn 1/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_1.12_13b03c777d9ab561_.log 2024-12-18T04:08:11.4014721Z Running 25 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:08:11.4026498Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2024-12-18T04:08:11.4027573Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook 2024-12-18T04:08:11.4028625Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view 2024-12-18T04:08:11.4029580Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2024-12-18T04:08:11.4030882Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2024-12-18T04:08:11.4032021Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2024-12-18T04:08:11.4032960Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2024-12-18T04:08:11.4033934Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2024-12-18T04:08:11.4034895Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2024-12-18T04:08:11.4035772Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2024-12-18T04:08:11.4036694Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda 2024-12-18T04:08:11.4037556Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda 2024-12-18T04:08:11.4038432Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2024-12-18T04:08:11.4039342Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2024-12-18T04:08:11.4040346Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger 2024-12-18T04:08:11.4041454Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2024-12-18T04:08:11.4042431Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2024-12-18T04:08:11.4043329Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2024-12-18T04:08:11.4044289Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2024-12-18T04:08:11.4045192Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2024-12-18T04:08:11.4046017Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2024-12-18T04:08:11.4046847Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph 2024-12-18T04:08:11.4047746Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order 2024-12-18T04:08:11.4048637Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda 2024-12-18T04:08:11.4049485Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:08:11.4049967Z 
2024-12-18T04:08:11.4050172Z Running distributed tests for the nccl backend with file init_method 2024-12-18T04:08:11.4050583Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:08:11.4051618Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=1', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:08:11.401924] 2024-12-18T04:12:47.2651895Z 2024-12-18T04:12:47.2658271Z distributed/test_distributed_spawn 1/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_1.12_f91e4fc6824fd889_.log 2024-12-18T04:12:47.2670659Z Running 25 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:12:47.2680841Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2024-12-18T04:12:47.2681878Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook 2024-12-18T04:12:47.2682910Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view 2024-12-18T04:12:47.2683855Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2024-12-18T04:12:47.2684757Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2024-12-18T04:12:47.2685710Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2024-12-18T04:12:47.2686628Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2024-12-18T04:12:47.2687597Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2024-12-18T04:12:47.2688708Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2024-12-18T04:12:47.2691649Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2024-12-18T04:12:47.2692625Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda 2024-12-18T04:12:47.2693516Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda 2024-12-18T04:12:47.2694392Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2024-12-18T04:12:47.2695399Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2024-12-18T04:12:47.2696419Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger 2024-12-18T04:12:47.2697535Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2024-12-18T04:12:47.2698512Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2024-12-18T04:12:47.2699407Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2024-12-18T04:12:47.2700356Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2024-12-18T04:12:47.2701267Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2024-12-18T04:12:47.2702204Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2024-12-18T04:12:47.2703181Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph 2024-12-18T04:12:47.2704109Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order 2024-12-18T04:12:47.2705002Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda 2024-12-18T04:12:47.2705845Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:12:47.2706316Z 2024-12-18T04:12:47.2706531Z Running distributed tests for the gloo backend with env init_method 2024-12-18T04:12:47.2706935Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:12:47.2707962Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=1', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:12:47.267840] 2024-12-18T04:17:20.1308403Z 2024-12-18T04:17:20.1310301Z distributed/test_distributed_spawn 1/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_1.12_6f152df26db32d69_.log 2024-12-18T04:17:20.1330227Z Running 25 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:17:20.1340822Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2024-12-18T04:17:20.1342033Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook 2024-12-18T04:17:20.1343268Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view 2024-12-18T04:17:20.1344218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2024-12-18T04:17:20.1345109Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2024-12-18T04:17:20.1346064Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2024-12-18T04:17:20.1346988Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2024-12-18T04:17:20.1347963Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2024-12-18T04:17:20.1348931Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2024-12-18T04:17:20.1349815Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2024-12-18T04:17:20.1350724Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda 2024-12-18T04:17:20.1351768Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda 2024-12-18T04:17:20.1352636Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2024-12-18T04:17:20.1353940Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2024-12-18T04:17:20.1355488Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger 2024-12-18T04:17:20.1357186Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2024-12-18T04:17:20.1358678Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2024-12-18T04:17:20.1360044Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2024-12-18T04:17:20.1361514Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2024-12-18T04:17:20.1362909Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2024-12-18T04:17:20.1364370Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2024-12-18T04:17:20.1365864Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph 2024-12-18T04:17:20.1367292Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order 2024-12-18T04:17:20.1368652Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda 2024-12-18T04:17:20.1369947Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:17:20.1370683Z 2024-12-18T04:17:20.1370989Z Running distributed tests for the gloo backend with file init_method 2024-12-18T04:17:20.1371614Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:17:20.1373186Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=1', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:17:20.133727] 2024-12-18T04:21:52.3642191Z 2024-12-18T04:21:52.3647344Z distributed/test_distributed_spawn 1/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_1.12_b5a0c84213d2bfd0_.log 2024-12-18T04:21:52.3665781Z Running 25 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:21:52.3676256Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2024-12-18T04:21:52.3677298Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_allreduce_hook 2024-12-18T04:21:52.3678333Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync_grad_is_view 2024-12-18T04:21:52.3679278Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2024-12-18T04:21:52.3680176Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2024-12-18T04:21:52.3681124Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2024-12-18T04:21:52.3682050Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2024-12-18T04:21:52.3683018Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2024-12-18T04:21:52.3683984Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2024-12-18T04:21:52.3684869Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2024-12-18T04:21:52.3685776Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group_cuda 2024-12-18T04:21:52.3686642Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_cuda 2024-12-18T04:21:52.3688218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2024-12-18T04:21:52.3689989Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2024-12-18T04:21:52.3691955Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_with_logger 2024-12-18T04:21:52.3694401Z Running 1 
items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2024-12-18T04:21:52.3696546Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2024-12-18T04:21:52.3698258Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2024-12-18T04:21:52.3699211Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2024-12-18T04:21:52.3700122Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2024-12-18T04:21:52.3700942Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2024-12-18T04:21:52.3701781Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_invalid_static_graph 2024-12-18T04:21:52.3702686Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_failure_order 2024-12-18T04:21:52.3703575Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda 2024-12-18T04:21:52.3704416Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_api_cpu 2024-12-18T04:21:52.3704886Z 2024-12-18T04:21:52.3705137Z Running distributed/test_distributed_spawn 4/12 ... [2024-12-18 04:21:52.366654] 2024-12-18T04:21:52.3705618Z MPI not available -- MPI backend tests will be skipped 2024-12-18T04:21:52.3706063Z Running distributed tests for the test backend with env init_method 2024-12-18T04:21:52.3706473Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:21:52.3707503Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=4', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:21:52.369130] 2024-12-18T04:21:57.5549226Z 2024-12-18T04:21:57.5551066Z distributed/test_distributed_spawn 4/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_4.12_f20a53004909df07_.log 2024-12-18T04:21:57.5552659Z Running 0 items in this shard: 2024-12-18T04:21:57.5552998Z 2024-12-18T04:21:57.5564952Z Running distributed tests for the test backend with file init_method 2024-12-18T04:21:57.5567860Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:21:57.5575236Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=4', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:21:57.556825] 2024-12-18T04:22:02.7499828Z 2024-12-18T04:22:02.7502186Z distributed/test_distributed_spawn 4/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_4.12_e0b1c6788b5f824a_.log 2024-12-18T04:22:02.7504214Z Running 0 items in this shard: 2024-12-18T04:22:02.7504653Z 2024-12-18T04:22:02.7512013Z Running distributed tests for the nccl backend with env init_method 2024-12-18T04:22:02.7512490Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:22:02.7517684Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=4', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:22:02.751381] 2024-12-18T04:25:25.4102438Z 2024-12-18T04:25:25.4109224Z distributed/test_distributed_spawn 4/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_4.12_08faaa98c1364f53_.log 2024-12-18T04:25:25.4121737Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:25:25.4131518Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2024-12-18T04:25:25.4132483Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU 2024-12-18T04:25:25.4133581Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2024-12-18T04:25:25.4134838Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2024-12-18T04:25:25.4135852Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2024-12-18T04:25:25.4136887Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2024-12-18T04:25:25.4137907Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product 2024-12-18T04:25:25.4138861Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2024-12-18T04:25:25.4139743Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2024-12-18T04:25:25.4149947Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2024-12-18T04:25:25.4151103Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2024-12-18T04:25:25.4152031Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2024-12-18T04:25:25.4152911Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2024-12-18T04:25:25.4153841Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2024-12-18T04:25:25.4154821Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn 2024-12-18T04:25:25.4155732Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2024-12-18T04:25:25.4156585Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2024-12-18T04:25:25.4157598Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2024-12-18T04:25:25.4158567Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2024-12-18T04:25:25.4159384Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum 2024-12-18T04:25:25.4160274Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler 2024-12-18T04:25:25.4161284Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2024-12-18T04:25:25.4163160Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:25:25.4164219Z 2024-12-18T04:25:25.4164646Z Running distributed tests for the nccl backend with file init_method 2024-12-18T04:25:25.4165460Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:25:25.4167478Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=4', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:25:25.412550] 2024-12-18T04:28:48.7143568Z 2024-12-18T04:28:48.7145443Z distributed/test_distributed_spawn 4/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_4.12_c75bcf68dfe14714_.log 2024-12-18T04:28:48.7157079Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:28:48.7166924Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2024-12-18T04:28:48.7167887Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU 2024-12-18T04:28:48.7168979Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2024-12-18T04:28:48.7170147Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2024-12-18T04:28:48.7171132Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2024-12-18T04:28:48.7172026Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2024-12-18T04:28:48.7172884Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product 2024-12-18T04:28:48.7173734Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2024-12-18T04:28:48.7174686Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2024-12-18T04:28:48.7175641Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2024-12-18T04:28:48.7176627Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2024-12-18T04:28:48.7177570Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2024-12-18T04:28:48.7178601Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2024-12-18T04:28:48.7179628Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2024-12-18T04:28:48.7180606Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn 2024-12-18T04:28:48.7181502Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2024-12-18T04:28:48.7182536Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2024-12-18T04:28:48.7183679Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2024-12-18T04:28:48.7184643Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2024-12-18T04:28:48.7185454Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum 2024-12-18T04:28:48.7186324Z Running 1 
items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler 2024-12-18T04:28:48.7187266Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2024-12-18T04:28:48.7188221Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:28:48.7188756Z 2024-12-18T04:28:48.7188960Z Running distributed tests for the gloo backend with env init_method 2024-12-18T04:28:48.7189369Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:28:48.7190388Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=4', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:28:48.715815] 2024-12-18T04:32:34.6339752Z 2024-12-18T04:32:34.6341267Z distributed/test_distributed_spawn 4/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_4.12_629b709a6d5bc186_.log 2024-12-18T04:32:34.6350805Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:32:34.6358760Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2024-12-18T04:32:34.6359539Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU 2024-12-18T04:32:34.6360431Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2024-12-18T04:32:34.6361365Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2024-12-18T04:32:34.6362159Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2024-12-18T04:32:34.6362882Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2024-12-18T04:32:34.6363570Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product 2024-12-18T04:32:34.6364251Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2024-12-18T04:32:34.6364963Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2024-12-18T04:32:34.6365731Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2024-12-18T04:32:34.6366525Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2024-12-18T04:32:34.6367265Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2024-12-18T04:32:34.6367972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2024-12-18T04:32:34.6368710Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2024-12-18T04:32:34.6369495Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn 2024-12-18T04:32:34.6370220Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2024-12-18T04:32:34.6370946Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2024-12-18T04:32:34.6371906Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2024-12-18T04:32:34.6372762Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2024-12-18T04:32:34.6373415Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum 2024-12-18T04:32:34.6374119Z Running 1 
items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler 2024-12-18T04:32:34.6374954Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2024-12-18T04:32:34.6375724Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:32:34.6376383Z 2024-12-18T04:32:34.6376582Z Running distributed tests for the gloo backend with file init_method 2024-12-18T04:32:34.6376968Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:32:34.6378032Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=4', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:32:34.634802] 2024-12-18T04:36:20.5524431Z 2024-12-18T04:36:20.5525410Z distributed/test_distributed_spawn 4/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_4.12_d43c2f53eef60504_.log 2024-12-18T04:36:20.5534011Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2024-12-18T04:36:20.5541774Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2024-12-18T04:36:20.5542543Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallelCPU 2024-12-18T04:36:20.5543423Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2024-12-18T04:36:20.5544358Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2024-12-18T04:36:20.5545577Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2024-12-18T04:36:20.5546300Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2024-12-18T04:36:20.5547176Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_product 2024-12-18T04:36:20.5547862Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2024-12-18T04:36:20.5548569Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2024-12-18T04:36:20.5549335Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2024-12-18T04:36:20.5550123Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2024-12-18T04:36:20.5550865Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2024-12-18T04:36:20.5551573Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2024-12-18T04:36:20.5552321Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2024-12-18T04:36:20.5553100Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs_stop_iteration_sync_bn 2024-12-18T04:36:20.5553828Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2024-12-18T04:36:20.5554515Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2024-12-18T04:36:20.5555332Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2024-12-18T04:36:20.5556126Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2024-12-18T04:36:20.5556791Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum 2024-12-18T04:36:20.5557491Z Running 1 
items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_autograd_profiler
2024-12-18T04:36:20.5558251Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler
2024-12-18T04:36:20.5559018Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler
2024-12-18T04:36:20.5559452Z 
2024-12-18T04:36:20.5559644Z Running distributed/test_distributed_spawn 7/12 ... [2024-12-18 04:36:20.553332]
2024-12-18T04:36:20.5560038Z MPI not available -- MPI backend tests will be skipped
2024-12-18T04:36:20.5560399Z Running distributed tests for the test backend with env init_method
2024-12-18T04:36:20.5560731Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T04:36:20.5561562Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=7', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:36:20.554977]
2024-12-18T04:36:25.7230502Z 
2024-12-18T04:36:25.7232566Z distributed/test_distributed_spawn 7/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_7.12_f945a14198ee167b_.log
2024-12-18T04:36:25.7235266Z Running 0 items in this shard: 
2024-12-18T04:36:25.7235638Z 
2024-12-18T04:36:25.7243727Z Running distributed tests for the test backend with file init_method
2024-12-18T04:36:25.7246635Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T04:36:25.7253667Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=7', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:36:25.724792]
2024-12-18T04:36:30.8735215Z 
2024-12-18T04:36:30.8736719Z distributed/test_distributed_spawn 7/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_7.12_2fffa36323597a3d_.log
2024-12-18T04:36:30.8738439Z Running 0 items in this shard: 
2024-12-18T04:36:30.8738807Z 
2024-12-18T04:36:30.8741495Z Running distributed tests for the nccl backend with env init_method
2024-12-18T04:36:30.8742383Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T04:36:30.8744997Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=7', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:36:30.874152]
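[Editor's note] From here the harness switches to the nccl backend, still alternating between env and file init_method. The two styles differ only in how ranks rendezvous when the process group is created. Below is a minimal sketch of both, using the public torch.distributed API; the address, port, and file path are placeholders, and the real test harness wires this up internally.

```python
# Sketch: env:// vs file:// rendezvous for torch.distributed.
import os
import torch.distributed as dist

def init_env(rank: int, world_size: int) -> None:
    # env:// — every rank reads the master address/port from the environment.
    os.environ["MASTER_ADDR"] = "127.0.0.1"   # placeholder
    os.environ["MASTER_PORT"] = "29500"       # placeholder
    dist.init_process_group("nccl", init_method="env://",
                            rank=rank, world_size=world_size)

def init_file(rank: int, world_size: int) -> None:
    # file:// — every rank coordinates through one shared, initially
    # non-existent file on a filesystem visible to all ranks.
    dist.init_process_group("nccl", init_method="file:///tmp/ddp_rendezvous",
                            rank=rank, world_size=world_size)
```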
2024-12-18T04:39:55.8830835Z 
2024-12-18T04:39:55.8832616Z distributed/test_distributed_spawn 7/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_7.12_ab1d3104c746ae9a_.log
2024-12-18T04:39:55.8855968Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger
2024-12-18T04:39:55.8865735Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class
2024-12-18T04:39:55.8866714Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input
2024-12-18T04:39:55.8867933Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg
2024-12-18T04:39:55.8868911Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max
2024-12-18T04:39:55.8869891Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2024-12-18T04:39:55.8870860Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max 2024-12-18T04:39:55.8871787Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2024-12-18T04:39:55.8872687Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min 2024-12-18T04:39:55.8873556Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2024-12-18T04:39:55.8874429Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split 2024-12-18T04:39:55.8875409Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2024-12-18T04:39:55.8876459Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda 2024-12-18T04:39:55.8877513Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2024-12-18T04:39:55.8878420Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2024-12-18T04:39:55.8879272Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2024-12-18T04:39:55.8880131Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager 2024-12-18T04:39:55.8880995Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph 2024-12-18T04:39:55.8881845Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2024-12-18T04:39:55.8882736Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2024-12-18T04:39:55.8883866Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product 2024-12-18T04:39:55.8885529Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2024-12-18T04:39:55.8887219Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler 2024-12-18T04:39:55.8889073Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:39:55.8890156Z 2024-12-18T04:39:55.8890574Z Running distributed tests for the nccl backend with file init_method 2024-12-18T04:39:55.8891379Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:39:55.8893367Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=7', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:39:55.885634] 2024-12-18T04:43:21.2802374Z 2024-12-18T04:43:21.2803849Z distributed/test_distributed_spawn 7/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_7.12_792778ff50848fb0_.log 2024-12-18T04:43:21.2813292Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:43:21.2822911Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2024-12-18T04:43:21.2823916Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input 2024-12-18T04:43:21.2824942Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2024-12-18T04:43:21.2825909Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2024-12-18T04:43:21.2826889Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2024-12-18T04:43:21.2827853Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max 2024-12-18T04:43:21.2828774Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2024-12-18T04:43:21.2829908Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min 2024-12-18T04:43:21.2830777Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2024-12-18T04:43:21.2831803Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split 2024-12-18T04:43:21.2832787Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2024-12-18T04:43:21.2833866Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda 2024-12-18T04:43:21.2834922Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2024-12-18T04:43:21.2835844Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2024-12-18T04:43:21.2836704Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2024-12-18T04:43:21.2837578Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager 2024-12-18T04:43:21.2838456Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph 2024-12-18T04:43:21.2839316Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2024-12-18T04:43:21.2840211Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2024-12-18T04:43:21.2841170Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product 2024-12-18T04:43:21.2842045Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2024-12-18T04:43:21.2842914Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler 2024-12-18T04:43:21.2843873Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:43:21.2844425Z 2024-12-18T04:43:21.2844629Z Running distributed tests for the gloo backend with env init_method 2024-12-18T04:43:21.2845032Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:43:21.2846065Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=7', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:43:21.281145] 2024-12-18T04:47:06.8625683Z 2024-12-18T04:47:06.8627330Z distributed/test_distributed_spawn 7/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_7.12_4b72e46009d72225_.log 2024-12-18T04:47:06.8647397Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:47:06.8657885Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2024-12-18T04:47:06.8658869Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input 2024-12-18T04:47:06.8659881Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2024-12-18T04:47:06.8660848Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2024-12-18T04:47:06.8661833Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2024-12-18T04:47:06.8662800Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max 2024-12-18T04:47:06.8663726Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2024-12-18T04:47:06.8664626Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min 2024-12-18T04:47:06.8665495Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2024-12-18T04:47:06.8666373Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split 2024-12-18T04:47:06.8667362Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2024-12-18T04:47:06.8668406Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda 2024-12-18T04:47:06.8669441Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2024-12-18T04:47:06.8670349Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2024-12-18T04:47:06.8671184Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2024-12-18T04:47:06.8672226Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager 2024-12-18T04:47:06.8673250Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph 2024-12-18T04:47:06.8674118Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2024-12-18T04:47:06.8675018Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2024-12-18T04:47:06.8675966Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product 2024-12-18T04:47:06.8676828Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2024-12-18T04:47:06.8677688Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler 2024-12-18T04:47:06.8678636Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:47:06.8679189Z 2024-12-18T04:47:06.8679398Z Running distributed tests for the gloo backend with file init_method 2024-12-18T04:47:06.8679803Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:47:06.8680822Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=7', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 04:47:06.865062] 2024-12-18T04:50:52.1553626Z 2024-12-18T04:50:52.1555622Z distributed/test_distributed_spawn 7/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_7.12_5d6547776fb7d620_.log 2024-12-18T04:50:52.1575100Z Running 23 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:50:52.1584732Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2024-12-18T04:50:52.1585728Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_2D_Input 2024-12-18T04:50:52.1586768Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2024-12-18T04:50:52.1587739Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2024-12-18T04:50:52.1588725Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2024-12-18T04:50:52.1589706Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_max 2024-12-18T04:50:52.1590640Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2024-12-18T04:50:52.1591549Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_min 2024-12-18T04:50:52.1592412Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2024-12-18T04:50:52.1593474Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split 2024-12-18T04:50:52.1594523Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2024-12-18T04:50:52.1595584Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group_cuda 2024-12-18T04:50:52.1596621Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2024-12-18T04:50:52.1597563Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2024-12-18T04:50:52.1598566Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2024-12-18T04:50:52.1599619Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager 2024-12-18T04:50:52.1600517Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_compile_static_graph 2024-12-18T04:50:52.1601381Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2024-12-18T04:50:52.1602282Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2024-12-18T04:50:52.1603224Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_product 2024-12-18T04:50:52.1604089Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2024-12-18T04:50:52.1604942Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_torch_profiler 2024-12-18T04:50:52.1606038Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2024-12-18T04:50:52.1606589Z 2024-12-18T04:50:52.1606828Z Running distributed/test_distributed_spawn 10/12 ... 
[2024-12-18 04:50:52.157664]
2024-12-18T04:50:52.1607446Z MPI not available -- MPI backend tests will be skipped
2024-12-18T04:50:52.1607894Z Running distributed tests for the test backend with env init_method
2024-12-18T04:50:52.1608302Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T04:50:52.1609336Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=10', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:50:52.160270]
2024-12-18T04:50:57.3590387Z 
2024-12-18T04:50:57.3591660Z distributed/test_distributed_spawn 10/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_10.12_b274bfc9aff18db0_.log
2024-12-18T04:50:57.3592536Z Running 0 items in this shard: 
2024-12-18T04:50:57.3592738Z 
2024-12-18T04:50:57.3598829Z Running distributed tests for the test backend with file init_method
2024-12-18T04:50:57.3601056Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T04:50:57.3606784Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=10', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:50:57.360327]
2024-12-18T04:51:02.5917047Z 
2024-12-18T04:51:02.5919095Z distributed/test_distributed_spawn 10/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_10.12_e839fdf5000925ad_.log
2024-12-18T04:51:02.5920890Z Running 0 items in this shard: 
2024-12-18T04:51:02.5921263Z 
2024-12-18T04:51:02.5926980Z Running distributed tests for the nccl backend with env init_method
2024-12-18T04:51:02.5928697Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T04:51:02.5936223Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=10', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:51:02.593001]
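[Editor's note] Shard 10 now repeats the pattern already seen for shards 4 and 7: one pass per (backend, init_method) pair, with the MPI backend skipped because no MPI runtime is present on the runner. A rough sketch of that outer loop follows; the helper names (run_shard, the mpiexec probe) are hypothetical and only illustrate the pattern visible in this log.

```python
# Sketch of the loop this log traces: test/nccl/gloo x env/file, MPI skipped.
import shutil

BACKENDS = ["test", "nccl", "gloo"]   # "mpi" omitted when unavailable
INIT_METHODS = ["env", "file"]

def run_shard(backend: str, init_method: str, shard_id: int, num_shards: int) -> None:
    """Hypothetical: launch distributed/test_distributed_spawn.py as above."""
    ...

if shutil.which("mpiexec") is None:   # assumed availability probe
    print("MPI not available -- MPI backend tests will be skipped")

for backend in BACKENDS:
    for init_method in INIT_METHODS:
        print(f"Running distributed tests for the {backend} backend with {init_method} init_method")
        run_shard(backend, init_method, shard_id=10, num_shards=12)
```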
[2024-12-18 04:51:02.593001] 2024-12-18T04:55:16.4792645Z 2024-12-18T04:55:16.4793622Z distributed/test_distributed_spawn 10/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_10.12_16f39f8737a29cbe_.log 2024-12-18T04:55:16.4804899Z Running 27 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T04:55:16.4816188Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2024-12-18T04:55:16.4817386Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2024-12-18T04:55:16.4818418Z Running 
1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync 2024-12-18T04:55:16.4819354Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2024-12-18T04:55:16.4820315Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min 2024-12-18T04:55:16.4821314Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2024-12-18T04:55:16.4822245Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2024-12-18T04:55:16.4823108Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group 2024-12-18T04:55:16.4824059Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2024-12-18T04:55:16.4825085Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda 2024-12-18T04:55:16.4825998Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2024-12-18T04:55:16.4826834Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda 2024-12-18T04:55:16.4827710Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group 2024-12-18T04:55:16.4828600Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2024-12-18T04:55:16.4829486Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2024-12-18T04:55:16.4830603Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2024-12-18T04:55:16.4831552Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2024-12-18T04:55:16.4832577Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2024-12-18T04:55:16.4833440Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2024-12-18T04:55:16.4834306Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params 2024-12-18T04:55:16.4835144Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future 2024-12-18T04:55:16.4835924Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2024-12-18T04:55:16.4836749Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler 2024-12-18T04:55:16.4837641Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2024-12-18T04:55:16.4838567Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2024-12-18T04:55:16.4839558Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2024-12-18T04:55:16.4840534Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T04:55:16.4841045Z 2024-12-18T04:55:16.4841248Z Running distributed tests for the nccl backend with file init_method 2024-12-18T04:55:16.4841657Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:55:16.4842681Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=10', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 04:55:16.481674] 2024-12-18T04:59:30.6383650Z 2024-12-18T04:59:30.6386626Z distributed/test_distributed_spawn 10/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_10.12_9e34e722875bc47b_.log 2024-12-18T04:59:30.6395989Z Running 27 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T04:59:30.6405112Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2024-12-18T04:59:30.6406108Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2024-12-18T04:59:30.6406943Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync 2024-12-18T04:59:30.6407697Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2024-12-18T04:59:30.6408474Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min 2024-12-18T04:59:30.6409293Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2024-12-18T04:59:30.6410043Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2024-12-18T04:59:30.6410737Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group 2024-12-18T04:59:30.6411503Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2024-12-18T04:59:30.6412329Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda 2024-12-18T04:59:30.6413068Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2024-12-18T04:59:30.6413752Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda 2024-12-18T04:59:30.6414469Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group 2024-12-18T04:59:30.6415454Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2024-12-18T04:59:30.6416184Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2024-12-18T04:59:30.6416954Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2024-12-18T04:59:30.6417893Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2024-12-18T04:59:30.6418604Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2024-12-18T04:59:30.6419424Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2024-12-18T04:59:30.6420137Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params 2024-12-18T04:59:30.6420822Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future 2024-12-18T04:59:30.6421451Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2024-12-18T04:59:30.6422117Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler 2024-12-18T04:59:30.6422829Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2024-12-18T04:59:30.6423566Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2024-12-18T04:59:30.6424310Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2024-12-18T04:59:30.6424972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T04:59:30.6425340Z 2024-12-18T04:59:30.6425510Z Running distributed tests for the gloo backend with env init_method 2024-12-18T04:59:30.6425840Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T04:59:30.6426665Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=10', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
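
Each backend in this run is exercised twice, once with the env init_method and once with file; the only difference is how ranks rendezvous before the collectives run. A minimal sketch of both forms, assuming RANK and WORLD_SIZE are set in each process's environment and that the address, port, and file path are placeholders:

    import os
    import torch.distributed as dist

    # env:// rendezvous: peers find each other through environment variables,
    # which is what torchrun and the spawn-based tests set up for each rank.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", init_method="env://",
                            rank=int(os.environ["RANK"]),
                            world_size=int(os.environ["WORLD_SIZE"]))
    dist.destroy_process_group()

    # file:// rendezvous: all ranks agree on a shared filesystem path instead.
    dist.init_process_group(backend="gloo",
                            init_method="file:///tmp/dist_rendezvous",  # placeholder path
                            rank=int(os.environ["RANK"]),
                            world_size=int(os.environ["WORLD_SIZE"]))
    dist.destroy_process_group()
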
[2024-12-18 04:59:30.639599] 2024-12-18T05:04:20.1572667Z 2024-12-18T05:04:20.1574999Z distributed/test_distributed_spawn 10/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_10.12_c1451a2c72208b55_.log 2024-12-18T05:04:20.1595622Z Running 27 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T05:04:20.1606790Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2024-12-18T05:04:20.1607996Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2024-12-18T05:04:20.1609026Z Running 
1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync 2024-12-18T05:04:20.1609950Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2024-12-18T05:04:20.1610895Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min 2024-12-18T05:04:20.1611901Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2024-12-18T05:04:20.1612833Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2024-12-18T05:04:20.1613691Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group 2024-12-18T05:04:20.1614706Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2024-12-18T05:04:20.1615735Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda 2024-12-18T05:04:20.1616648Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2024-12-18T05:04:20.1617494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda 2024-12-18T05:04:20.1619136Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group 2024-12-18T05:04:20.1620868Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2024-12-18T05:04:20.1622600Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2024-12-18T05:04:20.1624461Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2024-12-18T05:04:20.1626341Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2024-12-18T05:04:20.1628358Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2024-12-18T05:04:20.1629247Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2024-12-18T05:04:20.1630262Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params 2024-12-18T05:04:20.1631116Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future 2024-12-18T05:04:20.1631895Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2024-12-18T05:04:20.1632722Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler 2024-12-18T05:04:20.1633600Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2024-12-18T05:04:20.1634521Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2024-12-18T05:04:20.1635421Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2024-12-18T05:04:20.1636248Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T05:04:20.1636695Z 2024-12-18T05:04:20.1636903Z Running distributed tests for the gloo backend with file init_method 2024-12-18T05:04:20.1637311Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:04:20.1639280Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=10', '--num-shards=12', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:04:20.159748] 2024-12-18T05:09:09.8519062Z 2024-12-18T05:09:09.8520032Z distributed/test_distributed_spawn 10/12 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_10.12_0b3035abcf10f2c7_.log 2024-12-18T05:09:09.8531334Z Running 27 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T05:09:09.8543203Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2024-12-18T05:09:09.8544408Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2024-12-18T05:09:09.8545429Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_accumulate_gradients_no_sync 2024-12-18T05:09:09.8546379Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2024-12-18T05:09:09.8547324Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_min 2024-12-18T05:09:09.8548316Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2024-12-18T05:09:09.8549245Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2024-12-18T05:09:09.8550098Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_full_group 2024-12-18T05:09:09.8551043Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2024-12-18T05:09:09.8552067Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group_cuda 2024-12-18T05:09:09.8552971Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2024-12-18T05:09:09.8553811Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group_cuda 2024-12-18T05:09:09.8554700Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_group 2024-12-18T05:09:09.8555585Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2024-12-18T05:09:09.8556468Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2024-12-18T05:09:09.8557416Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2024-12-18T05:09:09.8558372Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2024-12-18T05:09:09.8559261Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2024-12-18T05:09:09.8560114Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2024-12-18T05:09:09.8561147Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_data_parallel_params 2024-12-18T05:09:09.8562125Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_future 2024-12-18T05:09:09.8562910Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2024-12-18T05:09:09.8563727Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_autograd_profiler 2024-12-18T05:09:09.8564612Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2024-12-18T05:09:09.8565685Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2024-12-18T05:09:09.8566738Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2024-12-18T05:09:09.8567552Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda 2024-12-18T05:09:09.8568001Z 2024-12-18T05:09:09.8568243Z Running distributed/_tensor/test_redistribute 1/1 ... [2024-12-18 05:09:09.853666] 2024-12-18T05:09:09.8568689Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:09:09.8569673Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_redistribute.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
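
The redistribute suite launched above exercises moving a DTensor between Shard and Replicate placements on a device mesh. A minimal sketch of that round trip, assuming torchrun started one process per GPU and that the private torch.distributed._tensor module keeps its 2.x layout:

    import torch
    import torch.distributed as dist
    from torch.distributed._tensor import distribute_tensor, Replicate, Shard
    from torch.distributed.device_mesh import init_device_mesh

    dist.init_process_group(backend="nccl", init_method="env://")
    torch.cuda.set_device(dist.get_rank())
    mesh = init_device_mesh("cuda", (dist.get_world_size(),))

    torch.manual_seed(0)  # same seed so every rank builds the same global tensor
    big = torch.randn(8, 8, device="cuda")
    sharded = distribute_tensor(big, mesh, [Shard(0)])      # rows split across ranks
    replicated = sharded.redistribute(mesh, [Replicate()])  # all-gather back to full copies
    assert replicated.to_local().shape == (8, 8)
    dist.destroy_process_group()
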
[2024-12-18 05:09:09.854242] 2024-12-18T05:10:24.8203356Z 2024-12-18T05:10:24.8204824Z distributed/_tensor/test_redistribute 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_redistribute_1.1_310a675ba5e2676e_.log 2024-12-18T05:10:24.8214138Z Running 13 items in this shard: test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_partial_to_replicate_forward_backward, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_partial_to_shard, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_redistribute_negative_shard_dim, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_redistribute_shard_dim_change, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_redistribute_uneven_sharding, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_replicate_to_partial, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_replicate_to_replicate_forward_backward, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_replicate_to_shard_forward_backward, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall, test/distributed/_tensor/test_redistribute.py::RedistributeTest::test_shard_to_replicate_forward_backward, test/distributed/_tensor/test_redistribute.py::MultiDimRedistributeTest::test_multi_dim_mesh, test/distributed/_tensor/test_redistribute.py::MultiDimRedistributeTest::test_redistribute_shard_dim_multi_dim_mesh 2024-12-18T05:10:24.8221157Z 2024-12-18T05:10:24.8229558Z Running distributed/_tools/test_fsdp2_mem_tracker 1/1 ... [2024-12-18 05:10:24.822539] 2024-12-18T05:10:24.8230489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:10:24.8236813Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_fsdp2_mem_tracker.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:10:24.823114] 2024-12-18T05:10:30.1998843Z 2024-12-18T05:10:30.2000463Z distributed/_tools/test_fsdp2_mem_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_fsdp2_mem_tracker_1.1_266d150672139a34_.log 2024-12-18T05:10:30.2003065Z Running 3 items in this shard: test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCore::test_tracker_multi_group_eager, test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCore::test_tracker_non_root_forward_backward, test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCompose::test_tracker_with_activation_checkpointing 2024-12-18T05:10:30.2006563Z 2024-12-18T05:10:30.2007146Z Running distributed/checkpoint/e2e/test_e2e_save_and_load 1/1 ... [2024-12-18 05:10:30.199986] 2024-12-18T05:10:30.2008096Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:10:30.2010124Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/e2e/test_e2e_save_and_load.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:10:30.200341] 2024-12-18T05:11:52.3864087Z 2024-12-18T05:11:52.3866220Z distributed/checkpoint/e2e/test_e2e_save_and_load 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.e2e.test_e2e_save_and_load_1.1_cd9a4baa8b82ee90_.log 2024-12-18T05:11:52.3878786Z Running 15 items in this shard: test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_different_ordered_state_dict_keys, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_async_cached_cache_staged_state_dict_False, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_async_cached_cache_staged_state_dict_True, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_compile_False_model_type0, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_compile_False_model_type1, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_compile_False_model_type2, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_compile_True_model_type0, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_compile_True_model_type1, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_e2e_compile_True_model_type2, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_no_dist, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_overwrite, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_partial_load, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestE2ESaveAndLoad::test_stateful_and_non_stateful_loads, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestNoCPU::test_no_cpu, test/distributed/checkpoint/e2e/test_e2e_save_and_load.py::TestInitStateDict::test_init_state_dict 2024-12-18T05:11:52.3884823Z 2024-12-18T05:11:52.3885096Z Running distributed/checkpoint/test_format_utils 1/1 ... [2024-12-18 05:11:52.386607] 2024-12-18T05:11:52.3885569Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:11:52.3886587Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_format_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:11:52.387241] 2024-12-18T05:12:14.7500400Z 2024-12-18T05:12:14.7502275Z distributed/checkpoint/test_format_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_format_utils_1.1_f164afe0404a98e9_.log 2024-12-18T05:12:14.7505879Z Running 3 items in this shard: test/distributed/checkpoint/test_format_utils.py::TestFormatUtils::test_dcp_to_torch_save, test/distributed/checkpoint/test_format_utils.py::TestFormatUtils::test_online_torch_save_to_dcp, test/distributed/checkpoint/test_format_utils.py::TestFormatUtils::test_torch_save_to_dcp 2024-12-18T05:12:14.7509723Z 2024-12-18T05:12:14.7515066Z Running distributed/checkpoint/e2e/test_fine_tuning 1/1 ... 
[2024-12-18 05:12:14.750398] 2024-12-18T05:12:14.7515898Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:12:14.7517143Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/e2e/test_fine_tuning.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:12:14.751038] 2024-12-18T05:12:25.0890490Z 2024-12-18T05:12:25.0892409Z distributed/checkpoint/e2e/test_fine_tuning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.e2e.test_fine_tuning_1.1_d251d56c5c806aae_.log 2024-12-18T05:12:25.0894919Z Running 1 items in this shard: test/distributed/checkpoint/e2e/test_fine_tuning.py::TestFineTuning::test_fine_tuning 2024-12-18T05:12:25.0895863Z 2024-12-18T05:12:25.0897390Z Running distributed/_tensor/experimental/test_tp_transform 1/1 ... [2024-12-18 05:12:25.089301] 2024-12-18T05:12:25.0898378Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:12:25.0905587Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/experimental/test_tp_transform.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:12:25.089900] 2024-12-18T05:12:54.3199720Z 2024-12-18T05:12:54.3201587Z distributed/_tensor/experimental/test_tp_transform 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.experimental.test_tp_transform_1.1_dae2117dbd72297d_.log 2024-12-18T05:12:54.3207078Z Running 3 items in this shard: test/distributed/_tensor/experimental/test_tp_transform.py::TensorParallelTest::test_tp_transform_e2e, test/distributed/_tensor/experimental/test_tp_transform.py::TensorParallelTest::test_tp_transform_no_bias, test/distributed/_tensor/experimental/test_tp_transform.py::TensorParallelTest::test_tp_transform_with_uncovered_op 2024-12-18T05:12:54.3210214Z 2024-12-18T05:12:54.3210723Z Running distributed/checkpoint/test_traverse 1/1 ... [2024-12-18 05:12:54.320324] 2024-12-18T05:12:54.3211273Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:12:54.3214721Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_traverse.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:12:54.320946] 2024-12-18T05:12:59.6983754Z 2024-12-18T05:12:59.6985372Z distributed/checkpoint/test_traverse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_traverse_1.1_cbfceb284b918940_.log 2024-12-18T05:12:59.6990316Z Running 7 items in this shard: test/distributed/checkpoint/test_traverse.py::TestTraverse::test_get_element, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_set_element, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_doesnt_ignore_intermediate_collections, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_nested_dict, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_nested_list, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_shallow, test/distributed/checkpoint/test_traverse.py::TestTraverse::test_traverse_with_ordered_dict 2024-12-18T05:12:59.6992702Z 2024-12-18T05:12:59.6992994Z Running distributed/tensor/parallel/test_tp_random_state 1/1 ... [2024-12-18 05:12:59.698581] 2024-12-18T05:12:59.6993490Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:12:59.6994526Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/parallel/test_tp_random_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:12:59.699081] 2024-12-18T05:13:09.9861862Z 2024-12-18T05:13:09.9865041Z distributed/tensor/parallel/test_tp_random_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_tp_random_state_1.1_665adf248430da2d_.log 2024-12-18T05:13:09.9867657Z Running 1 items in this shard: test/distributed/tensor/parallel/test_tp_random_state.py::TensorParallelRandomStateTests::test_model_init 2024-12-18T05:13:09.9868723Z 2024-12-18T05:13:09.9869288Z Running distributed/elastic/test_control_plane 1/1 ... [2024-12-18 05:13:09.986452] 2024-12-18T05:13:09.9870176Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:13:09.9875819Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/test_control_plane.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:13:09.987009] 2024-12-18T05:13:21.2907792Z 2024-12-18T05:13:21.2909818Z distributed/elastic/test_control_plane 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.test_control_plane_1.1_fd1754d1686a48a0_.log 2024-12-18T05:13:21.2917174Z Running 9 items in this shard: test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_nccl_trace_pickle, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_nccl_trace_pickle_with_json, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_nccl_trace_pickle_with_params, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_traceback, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_get_handler_names, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_get_handler_nonexistant, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_run_handler, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_tcp, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_worker_server 2024-12-18T05:13:21.2923645Z 2024-12-18T05:13:21.2924259Z Running distributed/_composable/test_replicate_with_compiler 1/1 ... [2024-12-18 05:13:21.290941] 2024-12-18T05:13:21.2925257Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:13:21.2927329Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_replicate_with_compiler.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:13:21.291265] 2024-12-18T05:15:05.9860983Z 2024-12-18T05:15:05.9862113Z distributed/_composable/test_replicate_with_compiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_with_compiler_1.1_c104570ce00d6aff_.log 2024-12-18T05:15:05.9866876Z Running 10 items in this shard: test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_bucketing_coalesced_op, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_bucketing_concat_op, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_backward_only, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_bf16, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_cpu, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_cpu_no_sync, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_fp16, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_gpu, test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_compile_gpu_ac, test/distributed/_composable/test_replicate_with_compiler.py::DDP_TP_Test::test_ddp_tp 2024-12-18T05:15:05.9870946Z 2024-12-18T05:15:05.9871151Z Running distributed/test_nccl 1/1 ... [2024-12-18 05:15:05.986126] 2024-12-18T05:15:05.9871544Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:15:05.9872691Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_nccl.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
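
test_nccl, launched above, drives the low-level NCCL/RCCL bindings directly over the listed dtypes (the float8 *fnuz variants are the ROCm-specific formats). The public equivalents of the collectives it covers look like this; a minimal sketch assuming torchrun started one process per GPU:

    import torch
    import torch.distributed as dist

    dist.init_process_group(backend="nccl", init_method="env://")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    t = torch.full((4,), float(rank), device="cuda", dtype=torch.bfloat16)
    dist.broadcast(t, src=0)               # every rank now holds rank 0's values
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    # reduce_scatter: each rank keeps one reduced chunk of the stacked inputs.
    out = torch.empty(4, device="cuda", dtype=torch.bfloat16)
    chunks = [torch.ones(4, device="cuda", dtype=torch.bfloat16)
              for _ in range(dist.get_world_size())]
    dist.reduce_scatter(out, chunks, op=dist.ReduceOp.SUM)
    dist.destroy_process_group()
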
[2024-12-18 05:15:05.986394] 2024-12-18T05:15:12.5652601Z 2024-12-18T05:15:12.5653838Z distributed/test_nccl 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_nccl_1.1_75328199ce8d2522_.log 2024-12-18T05:15:12.5662701Z Running 14 items in this shard: test/distributed/test_nccl.py::TestNCCLCUDA::test_all_gather_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_gather_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_reduce_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_all_reduce_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_float8_e4m3fnuz, test/distributed/test_nccl.py::TestNCCLCUDA::test_broadcast_cuda_float8_e5m2fnuz, test/distributed/test_nccl.py::TestNCCLCUDA::test_collective_errors_cuda, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_scatter_cuda_bfloat16, test/distributed/test_nccl.py::TestNCCLCUDA::test_reduce_scatter_cuda_float32, test/distributed/test_nccl.py::TestNCCLCUDA::test_unique_id_cuda 2024-12-18T05:15:12.5670573Z 2024-12-18T05:15:12.5670808Z Running distributed/test_functional_api 1/1 ... [2024-12-18 05:15:12.565524] 2024-12-18T05:15:12.5671242Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:15:12.5672218Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_functional_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:15:12.566057] 2024-12-18T05:16:54.5263535Z 2024-12-18T05:16:54.5264498Z distributed/test_functional_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_functional_api_1.1_e4716c9850673e45_.log 2024-12-18T05:16:54.5273799Z Running 18 items in this shard: test/distributed/test_functional_api.py::TestMetaCollectives::test_all_reduce, test/distributed/test_functional_api.py::TestMakeFx::test_all_reduce_tracing, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_gather_into_tensor_coalesced_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_1d_input_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_split_sizes_none_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_dce_code_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_fakepg_cuda, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_all_gather_into_tensor_coalesced, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_all_to_all_single, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_all_to_all_single_1d_input, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_all_to_all_single_split_sizes_none, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_permute_tensor_with_sub_group_cuda, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_tracing, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_tracing_with_dce_code, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_tracing_with_fakepg, test/distributed/test_functional_api.py::TestFunctionalAutogradWithDistributedBackendCUDA::test_all_to_all_single_cuda 2024-12-18T05:16:54.5282399Z 2024-12-18T05:16:54.5282695Z Running distributed/optim/test_apply_optimizer_in_backward 1/1 ... [2024-12-18 05:16:54.526557] 2024-12-18T05:16:54.5283194Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:16:54.5284243Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/optim/test_apply_optimizer_in_backward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:16:54.526881] 2024-12-18T05:16:58.5071874Z 2024-12-18T05:16:58.5074119Z distributed/optim/test_apply_optimizer_in_backward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.optim.test_apply_optimizer_in_backward_1.1_d5fda68149e5b847_.log 2024-12-18T05:16:58.5075732Z 2024-12-18T05:16:58.5079430Z Running distributed/fsdp/test_fsdp_state_dict 2/3 ... 
[2024-12-18 05:16:58.507392] 2024-12-18T05:16:58.5080499Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:16:58.5084597Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_state_dict.py', '--shard-id=2', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:16:58.507929] 2024-12-18T05:23:08.7053090Z 2024-12-18T05:23:08.7055028Z distributed/fsdp/test_fsdp_state_dict 2/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_state_dict_2.3_2bd85ef44022909f_.log 2024-12-18T05:23:08.7120658Z Running 49 items in this shard: test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_keys_state_dict_type_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_sharded_load_multi_backend_pg, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_rank0_offload_save_load_flow_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_save_load_flow_state_dict_type_local_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_skip_module_state_dict_type_local_state_dict_double_nest_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_skip_module_state_dict_type_state_dict_double_nest_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_True_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_False_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_True_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_True_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_False_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_manual_ac_wrapper_state_dict_type_state_dict_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_shared_parameters_state_dict_type_local_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_torch_save_load 
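Editor's note: the shard above sweeps FSDP's state-dict flavors (local/sharded/full) against the cpu_offload, fp16/mixed_precision, state_dict_rank0_and_offload and use_orig_params knobs. A minimal sketch of the pattern under test, assuming an initialized process group and an FSDP-wrapped module; this is illustrative only, not code from this run:

```python
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import StateDictType, FullStateDictConfig

def save_full_state_dict(model: FSDP) -> dict:
    # Mirrors the state_dict_rank0_and_offload_True cases: gather the
    # unsharded state dict, offload it to CPU, materialize on rank 0 only.
    cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
    with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, cfg):
        state = model.state_dict()
    if dist.get_rank() == 0:
        # Offloaded copies live on CPU, so they are safe to torch.save.
        assert all(t.device.type == "cpu" for t in state.values())
    return state
```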
2024-12-18T05:23:08.7184576Z
2024-12-18T05:23:08.7185207Z Running distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 ... [2024-12-18 05:23:08.706679]
2024-12-18T05:23:08.7186557Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:23:08.7188908Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_grad_scaler.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:23:08.707308]
2024-12-18T05:23:20.1477675Z
2024-12-18T05:23:20.1478636Z distributed/_composable/fsdp/test_fully_shard_grad_scaler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_grad_scaler_1.1_d1c986aea4e181bb_.log
2024-12-18T05:23:20.1479753Z Running 1 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_grad_scaler.py::TestFullyShardGradientScaler::test_gradient_scaler
2024-12-18T05:23:20.1480211Z
2024-12-18T05:23:20.1484628Z Running distributed/checkpoint/test_utils 1/1 ... [2024-12-18 05:23:20.148095]
2024-12-18T05:23:20.1485032Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:23:20.1490704Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:23:20.148649]
2024-12-18T05:23:25.4258485Z
2024-12-18T05:23:25.4260604Z distributed/checkpoint/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_utils_1.1_29fbd48fe7f6a0b0_.log
2024-12-18T05:23:25.4265766Z Running 6 items in this shard: test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_dcp_logger, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_flat_data, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_index_hint_ignored_on_equals, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_index_hint_ignored_on_hash, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_init_convert_offset, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_sharded_tensor_lookup
2024-12-18T05:23:25.4267851Z
2024-12-18T05:23:25.4268084Z Running distributed/_tensor/test_utils 1/1 ... [2024-12-18 05:23:25.426079]
2024-12-18T05:23:25.4268528Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:23:25.4271302Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tensor/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:23:25.426675]
2024-12-18T05:24:24.5069736Z
2024-12-18T05:24:24.5071436Z distributed/_tensor/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tensor.test_utils_1.1_988a683aa68310db_.log
2024-12-18T05:24:24.5079396Z Running 10 items in this shard: test/distributed/_tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_1D, test/distributed/_tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_2D, test/distributed/_tensor/test_utils.py::UtilTest::test_fsdp_tp_meta_compute, test/distributed/_tensor/test_utils.py::UtilTest::test_hsdp_tp_meta_compute, test/distributed/_tensor/test_utils.py::UtilTest::test_strided_sharding_assumption_in_meta_compute, test/distributed/_tensor/test_utils.py::TestStridedSharding::test_1d_mesh_strided_sharding, test/distributed/_tensor/test_utils.py::TestStridedSharding::test_2d_mesh_2d_tensor_strided_sharding, test/distributed/_tensor/test_utils.py::TestStridedSharding::test_2d_mesh_strided_sharding, test/distributed/_tensor/test_utils.py::Test2DStridedLocalShard::test_fsdp1_tp_2d_dtensor_local_shards_and_offsets, test/distributed/_tensor/test_utils.py::Test2DStridedLocalShard::test_fsdp2_tp_2d_dtensor_local_shards_and_offsets
2024-12-18T05:24:24.5087168Z
2024-12-18T05:24:24.5087602Z Running distributed/test_c10d_nccl 2/3 ... [2024-12-18 05:24:24.507413]
2024-12-18T05:24:24.5088428Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:24:24.5090773Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_nccl.py', '--shard-id=2', '--num-shards=3', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:24:24.507998]
2024-12-18T05:38:53.8247949Z
2024-12-18T05:38:53.8249370Z distributed/test_c10d_nccl 2/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_nccl_2.3_8c9bad48b420ceb0_.log
2024-12-18T05:38:53.8289481Z Running 74 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_split_subgroup, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_destruct_before_terminate_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float16, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float64, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float8_e4m3fn, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_rank_filter, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_arbitrary_forward_return_value, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_multi_device_module_config, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_weight_sharing, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_invalid_powerSGD_state, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward,
test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_module_device_ids_None, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_blocking_clean_exit, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_blocking_sigkill, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_nonblocking, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_manager_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier_device_ids, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_off, test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership, test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream, test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_base_k, test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_base_k_float8_errors, test/distributed/test_c10d_nccl.py::CommTest::test_unwaited, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_collectives, test/distributed/test_c10d_nccl.py::LargeCommTest::test_batch_send_recv_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync, test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_object_list_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False, 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_recursive_split_group
2024-12-18T05:38:53.8453738Z
2024-12-18T05:38:53.8454211Z Running distributed/fsdp/test_fsdp_optim_state 1/2 ... [2024-12-18 05:38:53.825321]
2024-12-18T05:38:53.8455178Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:38:53.8457109Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_optim_state.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:38:53.825599]
2024-12-18T05:45:51.3180329Z
2024-12-18T05:45:51.3182345Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_optim_state 1/2 (test/test-reports/distributed.fsdp.test_fsdp_optim_state_1.2_0a1ae362c4d810ea_.log)
2024-12-18T05:45:51.3186027Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_optim_state/distributed.fsdp.test_fsdp_optim_state-cf12ea386aa05c69.xml
2024-12-18T05:45:51.3186884Z ============================= test session starts ==============================
2024-12-18T05:45:51.3187642Z platform linux -- Python 3.10.15, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python
2024-12-18T05:45:51.3188148Z cachedir: .pytest_cache
2024-12-18T05:45:51.3188714Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2024-12-18T05:45:51.3189314Z rootdir: /var/lib/jenkins/pytorch
2024-12-18T05:45:51.3189596Z configfile: pytest.ini
2024-12-18T05:45:51.3190160Z plugins: xdist-3.3.1, hypothesis-5.35.1, cpp-2.3.0, subtests-0.13.1, rerunfailures-14.0, flakefinder-1.1.0, xdoctest-1.1.0, typeguard-4.3.0
2024-12-18T05:45:51.3191546Z collecting ...
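Editor's note: the session header above loads rerunfailures-14.0, the plugin behind the runner's --reruns=2 flag (its effect shows up later as RERUN entries). A minimal sketch of that retry behavior, assuming pytest and pytest-rerunfailures are installed; the module-level counter is a hypothetical stand-in for real nondeterminism:

```python
import pytest

# State persists across reruns within the same pytest process, so the test
# fails once, is reported as RERUN, and passes on the second attempt.
flaky_counter = {"calls": 0}

def test_flaky_passes_on_retry():
    flaky_counter["calls"] += 1
    assert flaky_counter["calls"] >= 2  # fails only on the first attempt

if __name__ == "__main__":
    # Mirrors the runner's invocation style: pytest ... --reruns=2
    raise SystemExit(pytest.main([__file__, "-v", "--reruns=2"]))
```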
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:292: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_optim_state.py)
2024-12-18T05:45:51.3192691Z class TestDummyModel(torch.nn.Module):
2024-12-18T05:45:51.3193008Z collected 60 items
2024-12-18T05:45:51.3193298Z stepcurrent: Cannot find last run test, not skipping
2024-12-18T05:45:51.3211656Z Running 35 items in this shard: test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_compatible_with_trec, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_flatten_sharded_optim_state_dict_nested, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_full_optim_state_dict_keys, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_full_optim_state_dict_nested_invalid, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_no_grad, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_input_warning, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type0_use_multiple_param_groups_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type1_use_multiple_param_groups_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type1_use_multiple_param_groups_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_names, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_save_load_without_0th_param_state_state_dict_type0,
test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_save_load_without_0th_param_state_state_dict_type1, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_transformer, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_transformer, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_unmanaged_params_state_dict_type1_add_to_fsdp_module_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_state_dict_with_none_tensor_state, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_with_empty_optimizer_state, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_with_no_shard
2024-12-18T05:45:51.3229783Z
2024-12-18T05:45:51.3230880Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_compatible_with_trec /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/named_optimizer.py:98: UserWarning: Since we pass in param_groups, we will use param_groups to initialize the optimizer, not all parameters of the module.
2024-12-18T05:45:51.3232138Z warnings.warn(
2024-12-18T05:45:51.3236815Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:823: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:62.)
2024-12-18T05:45:51.3241551Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
2024-12-18T05:45:51.3253074Z dist init r=1, world=2
2024-12-18T05:45:51.3253526Z dist init r=0, world=2
2024-12-18T05:45:51.3253989Z PASSED [11.4662s] [ 2%]
2024-12-18T05:45:51.3257328Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_flatten_sharded_optim_state_dict_nested /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3261189Z else osd_method(model1, optim1)
2024-12-18T05:45:51.3264209Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2024-12-18T05:45:51.3266987Z warnings.warn(
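Editor's note: the FutureWarnings above and below name the replacements for the deprecated FSDP optimizer-state APIs (full_optim_state_dict, sharded_optim_state_dict, flatten_sharded_optim_state_dict). A minimal sketch of the newer calls, assuming model is FSDP-wrapped and optim is its optimizer; illustrative only, not code from this run:

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def optim_state_roundtrip(model: FSDP, optim: torch.optim.Optimizer) -> None:
    # FSDP.optim_state_dict replaces full_optim_state_dict /
    # sharded_optim_state_dict and follows the configured state_dict_type.
    osd = FSDP.optim_state_dict(model, optim)
    # ...persist osd (e.g. torch.save on rank 0), then to restore:
    flattened = FSDP.optim_state_dict_to_load(model, optim, osd)
    optim.load_state_dict(flattened)
```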
2024-12-18T05:45:51.3274627Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3275957Z else osd_method(model2, optim2, group=new_group)
2024-12-18T05:45:51.3282829Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1091: FutureWarning: ``FullyShardedDataParallel.flatten_sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.flatten_sharded_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3285627Z sharded_osd1 = FSDP.flatten_sharded_optim_state_dict(
2024-12-18T05:45:51.3293965Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1096: FutureWarning: ``FullyShardedDataParallel.flatten_sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.flatten_sharded_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3296882Z sharded_osd2 = FSDP.flatten_sharded_optim_state_dict(
2024-12-18T05:45:51.3304787Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2024-12-18T05:45:51.3306400Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index]
2024-12-18T05:45:51.3310235Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2024-12-18T05:45:51.3311098Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr]
2024-12-18T05:45:51.3313659Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2024-12-18T05:45:51.3314505Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size())
2024-12-18T05:45:51.3318538Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2024-12-18T05:45:51.3323025Z warnings.warn(
2024-12-18T05:45:51.3331246Z dist init r=0, world=2
2024-12-18T05:45:51.3331688Z dist init r=1, world=2
2024-12-18T05:45:51.3332129Z PASSED [11.6241s] [ 5%]
2024-12-18T05:45:51.3335388Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_full_optim_state_dict_keys /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:626: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3338737Z optim_state_dict = FSDP.full_optim_state_dict(
2024-12-18T05:45:51.3347315Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2024-12-18T05:45:51.3349710Z warnings.warn(
2024-12-18T05:45:51.3354404Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2024-12-18T05:45:51.3356354Z warnings.warn(
2024-12-18T05:45:51.3362434Z dist init r=0, world=2
2024-12-18T05:45:51.3362870Z dist init r=1, world=2
2024-12-18T05:45:51.3363314Z PASSED [11.7247s] [ 8%]
2024-12-18T05:45:51.3366729Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_full_optim_state_dict_nested_invalid /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:655: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3369786Z FSDP.full_optim_state_dict(model, optim)
2024-12-18T05:45:51.3373801Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2024-12-18T05:45:51.3375286Z warnings.warn(
2024-12-18T05:45:51.3378269Z dist init r=1, world=2
2024-12-18T05:45:51.3378501Z dist init r=0, world=2
2024-12-18T05:45:51.3378731Z PASSED [11.0239s] [ 11%]
2024-12-18T05:45:51.3379194Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_no_grad dist init r=1, world=2
2024-12-18T05:45:51.3379680Z dist init r=0, world=2
2024-12-18T05:45:51.3379906Z PASSED [11.2227s] [ 14%]
2024-12-18T05:45:51.3380403Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_input_warning dist init r=0, world=2
2024-12-18T05:45:51.3380931Z dist init r=1, world=2
2024-12-18T05:45:51.3381288Z PASSED [11.2237s] [ 17%]
2024-12-18T05:45:51.3384887Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:574: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3388595Z fsdp_osd = FSDP.full_optim_state_dict(
2024-12-18T05:45:51.3392965Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2024-12-18T05:45:51.3394551Z warnings.warn(
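Editor's note: the FSDP.state_dict_type() deprecation warning above points to the torch.distributed.checkpoint.state_dict APIs. A minimal sketch of that replacement, assuming model and optim are managed by FSDP (or DDP/FSDP2 per the warning text); illustrative only:

```python
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def checkpoint_roundtrip(model, optim):
    # One call pair covers FSDP1, FSDP2 and DDP, per the warning above.
    model_sd, optim_sd = get_state_dict(model, optim)
    # ...save both dicts (e.g. via torch.distributed.checkpoint), then:
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )
```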
2024-12-18T05:45:51.3397954Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] Caught exception:
2024-12-18T05:45:51.3399161Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] Traceback (most recent call last):
2024-12-18T05:45:51.3401885Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test
2024-12-18T05:45:51.3404229Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] getattr(self, test_name)()
2024-12-18T05:45:51.3406578Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper
2024-12-18T05:45:51.3408749Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] fn()
2024-12-18T05:45:51.3410901Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper
2024-12-18T05:45:51.3413170Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs)
2024-12-18T05:45:51.3415601Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 557, in instantiated_test
2024-12-18T05:45:51.3417950Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] test(self, **param_kwargs)
2024-12-18T05:45:51.3420297Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper
2024-12-18T05:45:51.3423054Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] return func(*args, **kwargs)
2024-12-18T05:45:51.3425838Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 539, in test_optim_state_dict_nested
2024-12-18T05:45:51.3428230Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] self.run_subtests(
2024-12-18T05:45:51.3430498Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests
2024-12-18T05:45:51.3432874Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] return run_subtests(self, *args, **kwargs)
2024-12-18T05:45:51.3434159Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests
2024-12-18T05:45:51.3435437Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] test_fn(*test_args, **test_kwargs, **subtest_kwargs)
2024-12-18T05:45:51.3436888Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 594, in _test_optim_state_dict_nested
2024-12-18T05:45:51.3438352Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] assert l1 == l2, f"Losses differ on iter {i}: {l1:.5f} {l2:.5f}"
2024-12-18T05:45:51.3440378Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] AssertionError: Losses differ on iter 0: -0.37255 190.85599
2024-12-18T05:45:51.3442180Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733]
2024-12-18T05:45:51.3443867Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] To execute this test, run the following from the base repo dir:
2024-12-18T05:45:51.3447064Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_optim_state.py TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False
2024-12-18T05:45:51.3450153Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733]
2024-12-18T05:45:51.3452260Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2024-12-18T05:45:51.3454679Z [rank1]:E1218 05:40:16.392000 174276 site-packages/torch/testing/_internal/common_distributed.py:733] exiting process 1 with exit code: 10
2024-12-18T05:45:51.3455843Z dist init r=1, world=2
2024-12-18T05:45:51.3456335Z ('RERUN', {'yellow': True}) [11.2234s] [ 20%]
2024-12-18T05:45:51.3460083Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:574: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2.
2024-12-18T05:45:51.3462378Z fsdp_osd = FSDP.full_optim_state_dict(
2024-12-18T05:45:51.3463839Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2024-12-18T05:45:51.3465266Z warnings.warn(
2024-12-18T05:45:51.3471143Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] Caught exception:
2024-12-18T05:45:51.3472150Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] Traceback (most recent call last):
2024-12-18T05:45:51.3473524Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test
2024-12-18T05:45:51.3474735Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] getattr(self, test_name)()
2024-12-18T05:45:51.3475931Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper
2024-12-18T05:45:51.3477051Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] fn()
2024-12-18T05:45:51.3478164Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper
2024-12-18T05:45:51.3480357Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs)
2024-12-18T05:45:51.3482814Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 557, in instantiated_test
2024-12-18T05:45:51.3485158Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] test(self, **param_kwargs)
2024-12-18T05:45:51.3487493Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper
2024-12-18T05:45:51.3489837Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] return func(*args, **kwargs)
2024-12-18T05:45:51.3492181Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 539, in test_optim_state_dict_nested
2024-12-18T05:45:51.3494452Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] self.run_subtests(
2024-12-18T05:45:51.3496864Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186,
in run_subtests 2024-12-18T05:45:51.3499262Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] return run_subtests(self, *args, **kwargs) 2024-12-18T05:45:51.3501116Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T05:45:51.3502993Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T05:45:51.3504848Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 594, in _test_optim_state_dict_nested 2024-12-18T05:45:51.3506740Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] assert l1 == l2, f"Losses differ on iter {i}: {l1:.5f} {l2:.5f}" 2024-12-18T05:45:51.3508582Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] AssertionError: Losses differ on iter 0: -0.37255 190.85599 2024-12-18T05:45:51.3510055Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T05:45:51.3511804Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] To execute this test, run the following from the base repo dir: 2024-12-18T05:45:51.3514366Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_optim_state.py TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False 2024-12-18T05:45:51.3515782Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T05:45:51.3516686Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T05:45:51.3517719Z [rank0]:E1218 05:40:27.545000 174430 site-packages/torch/testing/_internal/common_distributed.py:733] exiting process 0 with exit code: 10 2024-12-18T05:45:51.3518310Z dist init r=0, world=2 2024-12-18T05:45:51.3518572Z ('RERUN', {'yellow': True}) [11.0238s] [ 20%] 2024-12-18T05:45:51.3521092Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:574: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3524056Z fsdp_osd = FSDP.full_optim_state_dict( 2024-12-18T05:45:51.3525939Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:574: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
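For context on the traceback above: `run_subtests` expands a config of subtest parameters and forwards each combination to the test body as keyword arguments, which is why a single test node can RERUN and later PASS on a different sweep. A simplified sketch of that pattern (not PyTorch's actual implementation):

```python
# Simplified sketch of the run_subtests pattern from the traceback: every
# combination of the subtest config values is forwarded to the test
# function as keyword arguments.
from itertools import product

def run_subtests(subtest_config, test_fn, *test_args, **test_kwargs):
    keys = list(subtest_config)
    for values in product(*(subtest_config[k] for k in keys)):
        subtest_kwargs = dict(zip(keys, values))
        test_fn(*test_args, **test_kwargs, **subtest_kwargs)
```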
2024-12-18T05:45:51.3527818Z fsdp_osd = FSDP.full_optim_state_dict( 2024-12-18T05:45:51.3529944Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3532026Z warnings.warn( 2024-12-18T05:45:51.3534030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3536181Z warnings.warn( 2024-12-18T05:45:51.3537967Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:567: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3539834Z fsdp_osd = FSDP.full_optim_state_dict( 2024-12-18T05:45:51.3541259Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:567: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3542932Z fsdp_osd = FSDP.full_optim_state_dict( 2024-12-18T05:45:51.3544992Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:567: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3546874Z fsdp_osd = FSDP.full_optim_state_dict( 2024-12-18T05:45:51.3548309Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:567: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3549993Z fsdp_osd = FSDP.full_optim_state_dict( 2024-12-18T05:45:51.3550480Z dist init r=0, world=2 2024-12-18T05:45:51.3550885Z dist init r=1, world=2 2024-12-18T05:45:51.3551279Z PASSED [11.2219s] [ 20%] 2024-12-18T05:45:51.3554302Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3556267Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3557624Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. 
``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3558969Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3561144Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3563387Z warnings.warn( 2024-12-18T05:45:51.3565382Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3567456Z warnings.warn( 2024-12-18T05:45:51.3568154Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] Caught exception: 2024-12-18T05:45:51.3569390Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] Traceback (most recent call last): 2024-12-18T05:45:51.3571186Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T05:45:51.3572940Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] getattr(self, test_name)() 2024-12-18T05:45:51.3574764Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T05:45:51.3576401Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] fn() 2024-12-18T05:45:51.3578267Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T05:45:51.3580171Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs) 2024-12-18T05:45:51.3581927Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 557, in instantiated_test 2024-12-18T05:45:51.3583696Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] test(self, **param_kwargs) 2024-12-18T05:45:51.3585437Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T05:45:51.3587191Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] return func(*args, **kwargs) 2024-12-18T05:45:51.3589107Z [rank1]:E1218 05:40:49.814000 174743 
site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 539, in test_optim_state_dict_nested 2024-12-18T05:45:51.3591161Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] self.run_subtests( 2024-12-18T05:45:51.3593181Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T05:45:51.3594652Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] return run_subtests(self, *args, **kwargs) 2024-12-18T05:45:51.3595915Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T05:45:51.3597198Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T05:45:51.3598469Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 594, in _test_optim_state_dict_nested 2024-12-18T05:45:51.3600073Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] assert l1 == l2, f"Losses differ on iter {i}: {l1:.5f} {l2:.5f}" 2024-12-18T05:45:51.3601901Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] AssertionError: Losses differ on iter 0: -0.37255 190.85599 2024-12-18T05:45:51.3603159Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T05:45:51.3604431Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] To execute this test, run the following from the base repo dir: 2024-12-18T05:45:51.3606830Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_optim_state.py TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False 2024-12-18T05:45:51.3608908Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T05:45:51.3610237Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T05:45:51.3611950Z [rank1]:E1218 05:40:49.814000 174743 site-packages/torch/testing/_internal/common_distributed.py:733] exiting process 1 with exit code: 10 2024-12-18T05:45:51.3612993Z dist init r=1, world=2 2024-12-18T05:45:51.3614052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
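The FutureWarning above asks callers to migrate from ShardedTensor to DTensor. A hedged sketch of building a dim-0 sharded DTensor, assuming an initialized process group and one CUDA device per rank (the mesh shape is an illustrative assumption):

```python
# Hedged DTensor sketch; assumes torch.distributed is initialized and one
# GPU per rank. Shard(0) splits dim 0 across ranks, matching the
# dim_0_size / tensor.narrow(0, ...) handling in _state_dict_utils.py.
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

def shard_on_dim0(full_tensor: torch.Tensor):
    mesh = init_device_mesh("cuda", (dist.get_world_size(),))
    return distribute_tensor(full_tensor, mesh, placements=[Shard(0)])
```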
2024-12-18T05:45:51.3615336Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3616530Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3617794Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3619246Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3620725Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3621427Z ('RERUN', {'yellow': True}) [11.0244s] [ 22%] 2024-12-18T05:45:51.3624104Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3625683Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3626764Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3627851Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3629049Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3630181Z warnings.warn( 2024-12-18T05:45:51.3631273Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3632402Z warnings.warn( 2024-12-18T05:45:51.3632962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3633614Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3634260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2024-12-18T05:45:51.3634950Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3635633Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3636444Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3637241Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3637895Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3638537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3639215Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3639895Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3640576Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3640880Z dist init r=0, world=2 2024-12-18T05:45:51.3641067Z dist init r=1, world=2 2024-12-18T05:45:51.3641258Z PASSED [11.7234s] [ 22%] 2024-12-18T05:45:51.3642799Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3644379Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3645466Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3646556Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3647770Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3648921Z warnings.warn( 2024-12-18T05:45:51.3650018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
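The UserWarning above names the private helpers involved; both live in `torch.distributed.distributed_c10d` and are internal, so this rename sketch is an assumption that may not hold across releases:

```python
# Internal-API sketch only: the warning says _get_pg_default_device is
# superseded by _get_object_coll_device for object collectives. Both are
# private and may change without notice.
from torch.distributed.distributed_c10d import _get_object_coll_device

def device_for_object_collectives(group=None):
    return _get_object_coll_device(group)
```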
2024-12-18T05:45:51.3651160Z warnings.warn( 2024-12-18T05:45:51.3651724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3652381Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3653032Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3653721Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3654408Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3655221Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3655874Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3656561Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3657383Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3658075Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3658756Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3659439Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3659746Z dist init r=1, world=2 2024-12-18T05:45:51.3659930Z dist init r=0, world=2 2024-12-18T05:45:51.3660116Z PASSED [11.0238s] [ 25%] 2024-12-18T05:45:51.3660760Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False dist init r=0, world=2 2024-12-18T05:45:51.3661429Z dist init r=1, world=2 2024-12-18T05:45:51.3661612Z PASSED [6.4139s] [ 28%] 2024-12-18T05:45:51.3662246Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True dist init r=1, world=2 2024-12-18T05:45:51.3662905Z dist init r=0, world=2 2024-12-18T05:45:51.3663086Z PASSED [6.4155s] [ 31%] 2024-12-18T05:45:51.3664580Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 
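The ``sharded_optim_state_dict`` deprecation repeated above has the same replacement as the full variant: the unified ``optim_state_dict`` API, with the SHARDED flavor selected via `state_dict_type`. A minimal sketch under the same `model`/`optim` assumptions:

```python
# Sketch of the suggested replacement for the deprecated
# FSDP.sharded_optim_state_dict.
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    ShardedOptimStateDictConfig,
    ShardedStateDictConfig,
    StateDictType,
)

def sharded_optim_state(model, optim):
    with FSDP.state_dict_type(
        model,
        StateDictType.SHARDED_STATE_DICT,
        ShardedStateDictConfig(),
        ShardedOptimStateDictConfig(),
    ):
        return FSDP.optim_state_dict(model, optim)
```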
2024-12-18T05:45:51.3666157Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3667243Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3668320Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3669516Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3670657Z warnings.warn( 2024-12-18T05:45:51.3671752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3672881Z warnings.warn( 2024-12-18T05:45:51.3673437Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3674092Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3674869Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3675554Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3676440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3677133Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3677811Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3678458Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3679101Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3679789Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3680479Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2024-12-18T05:45:51.3681163Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3681473Z dist init r=1, world=2 2024-12-18T05:45:51.3681664Z dist init r=0, world=2 2024-12-18T05:45:51.3681850Z PASSED [11.6250s] [ 34%] 2024-12-18T05:45:51.3683354Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3684923Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3686008Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:580: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3687092Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3688291Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3689428Z warnings.warn( 2024-12-18T05:45:51.3690524Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3691650Z warnings.warn( 2024-12-18T05:45:51.3692208Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3692858Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3693501Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3694298Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3695187Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3695870Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3696548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2024-12-18T05:45:51.3697196Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3697835Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3698526Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3699207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3699888Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3700193Z dist init r=1, world=2 2024-12-18T05:45:51.3700379Z dist init r=0, world=2 2024-12-18T05:45:51.3700562Z PASSED [11.2238s] [ 37%] 2024-12-18T05:45:51.3701202Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False dist init r=0, world=2 2024-12-18T05:45:51.3701867Z dist init r=1, world=2 2024-12-18T05:45:51.3702052Z PASSED [6.3144s] [ 40%] 2024-12-18T05:45:51.3702681Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True dist init r=0, world=2 2024-12-18T05:45:51.3703340Z dist init r=1, world=2 2024-12-18T05:45:51.3703525Z PASSED [6.4161s] [ 42%] 2024-12-18T05:45:51.3704901Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type0_use_multiple_param_groups_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1294: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3706331Z else FSDP.full_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3707380Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1294: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3708436Z else FSDP.full_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3709615Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3710742Z warnings.warn( 2024-12-18T05:45:51.3711836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2024-12-18T05:45:51.3713093Z warnings.warn( 2024-12-18T05:45:51.3714166Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1292: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3715238Z FSDP.full_optim_state_dict(model1, optim1, optim_input1) 2024-12-18T05:45:51.3716069Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1292: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3716896Z FSDP.full_optim_state_dict(model1, optim1, optim_input1) 2024-12-18T05:45:51.3717965Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1292: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3719026Z FSDP.full_optim_state_dict(model1, optim1, optim_input1) 2024-12-18T05:45:51.3719847Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1292: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3720662Z FSDP.full_optim_state_dict(model1, optim1, optim_input1) 2024-12-18T05:45:51.3721570Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1730: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3722499Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3723428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1730: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3724350Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3724631Z dist init r=0, world=2 2024-12-18T05:45:51.3724818Z dist init r=1, world=2 2024-12-18T05:45:51.3725002Z PASSED [11.2237s] [ 45%] 2024-12-18T05:45:51.3726412Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type1_use_multiple_param_groups_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1300: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3727892Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3728977Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1300: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 
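The `optim_input` FutureWarnings above state the argument can be dropped without changing behavior; a sketch of the call made at test_fsdp_optim_state.py:1292, minus the deprecated argument:

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def full_osd_without_optim_input(model1, optim1):
    # Deprecated: FSDP.full_optim_state_dict(model1, optim1, optim_input1)
    # Per the warning, optim_input can simply be omitted. Note that
    # full_optim_state_dict itself is also deprecated; see the
    # optim_state_dict sketch earlier in this log.
    return FSDP.full_optim_state_dict(model1, optim1)
```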
2024-12-18T05:45:51.3730057Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3731249Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3732494Z warnings.warn( 2024-12-18T05:45:51.3733698Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3734872Z warnings.warn( 2024-12-18T05:45:51.3735428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3736085Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3736730Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3737427Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3738114Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3738761Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3739409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3740089Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3740772Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3741454Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3742139Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3742817Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3743124Z dist init r=0, world=2 2024-12-18T05:45:51.3743315Z dist init r=1, world=2 2024-12-18T05:45:51.3743503Z PASSED [11.1236s] [ 48%] 2024-12-18T05:45:51.3744907Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type1_use_multiple_param_groups_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1300: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.3746377Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3747576Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3748712Z warnings.warn( 2024-12-18T05:45:51.3749719Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1300: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3750803Z fsdp_osd = FSDP.sharded_optim_state_dict(model1, optim1) 2024-12-18T05:45:51.3751989Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3753260Z warnings.warn( 2024-12-18T05:45:51.3753930Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3754592Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3755248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3755934Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3756622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3757307Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3758001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3758649Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3759289Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3759972Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3760657Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
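The `test_rekey_optim_state_dict_*` cases above exercise `FSDP.rekey_optim_state_dict`; a hedged sketch converting a full optimizer state dict from parameter-name keys to parameter-ID keys (the import path is taken from the module referenced in the warnings):

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.fully_sharded_data_parallel import OptimStateKeyType

def rekey_to_param_ids(fsdp_osd, model):
    # Keys of fsdp_osd["state"] switch from parameter names to the
    # integer IDs the optimizer uses.
    return FSDP.rekey_optim_state_dict(fsdp_osd, OptimStateKeyType.PARAM_ID, model)
```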
2024-12-18T05:45:51.3761340Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3761643Z dist init r=0, world=2 2024-12-18T05:45:51.3761829Z dist init r=1, world=2 2024-12-18T05:45:51.3762011Z PASSED [11.2248s] [ 51%] 2024-12-18T05:45:51.3763327Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_names /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1401: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3764672Z else FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.3765747Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1401: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3766820Z else FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.3767708Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1730: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3768640Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3769760Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1395: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3770829Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.3771804Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3772816Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3773734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1730: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3774710Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3775826Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1395: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3776896Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.3777761Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3778683Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3778960Z dist init r=0, world=2 2024-12-18T05:45:51.3779144Z dist init r=1, world=2 2024-12-18T05:45:51.3779336Z PASSED [11.7250s] [ 54%] 2024-12-18T05:45:51.3780647Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_save_load_without_0th_param_state_state_dict_type0 /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1569: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3782073Z fsdp_osd = FSDP.full_optim_state_dict(fsdp_model, optim, rank0_only=False) 2024-12-18T05:45:51.3783328Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3784466Z warnings.warn( 2024-12-18T05:45:51.3785446Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1569: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3786552Z fsdp_osd = FSDP.full_optim_state_dict(fsdp_model, optim, rank0_only=False) 2024-12-18T05:45:51.3787795Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3788948Z warnings.warn( 2024-12-18T05:45:51.3789975Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1570: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3791256Z flattened_osd = FSDP.shard_full_optim_state_dict(fsdp_osd, fsdp_model) 2024-12-18T05:45:51.3792546Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1570: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.3793716Z flattened_osd = FSDP.shard_full_optim_state_dict(fsdp_osd, fsdp_model) 2024-12-18T05:45:51.3794026Z dist init r=1, world=2 2024-12-18T05:45:51.3794215Z dist init r=0, world=2 2024-12-18T05:45:51.3794397Z PASSED [11.1223s] [ 57%] 2024-12-18T05:45:51.3795751Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_save_load_without_0th_param_state_state_dict_type1 /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1572: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3797182Z fsdp_osd = FSDP.sharded_optim_state_dict(fsdp_model, optim) 2024-12-18T05:45:51.3798289Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1572: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3799390Z fsdp_osd = FSDP.sharded_optim_state_dict(fsdp_model, optim) 2024-12-18T05:45:51.3800608Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3801755Z warnings.warn( 2024-12-18T05:45:51.3802856Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3803990Z warnings.warn( 2024-12-18T05:45:51.3805060Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1573: FutureWarning: ``FullyShardedDataParallel.flatten_sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.flatten_sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3806215Z flattened_osd = FSDP.flatten_sharded_optim_state_dict( 2024-12-18T05:45:51.3807372Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1573: FutureWarning: ``FullyShardedDataParallel.flatten_sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.flatten_sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3808525Z flattened_osd = FSDP.flatten_sharded_optim_state_dict( 2024-12-18T05:45:51.3809179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
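`shard_full_optim_state_dict` and `flatten_sharded_optim_state_dict`, both deprecated in the warnings above, funnel into the same replacement, `FSDP.optim_state_dict_to_load`. A sketch of the load path, assuming `fsdp_osd` was produced under the matching `state_dict_type`:

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def load_optim_state(model, optim, fsdp_osd):
    # Replaces shard_full_optim_state_dict / flatten_sharded_optim_state_dict:
    # convert the saved dict back to the flat-parameter form, then load it.
    flattened_osd = FSDP.optim_state_dict_to_load(model, optim, fsdp_osd)
    optim.load_state_dict(flattened_osd)
```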
2024-12-18T05:45:51.3809838Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3810492Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3811321Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3812014Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3812757Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.3813406Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3814094Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.3814827Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3815516Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3816206Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.3816891Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.3817198Z dist init r=0, world=2 2024-12-18T05:45:51.3817393Z dist init r=1, world=2 2024-12-18T05:45:51.3817580Z PASSED [11.1222s] [ 60%] 2024-12-18T05:45:51.3819064Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3820576Z else osd_method(model1, optim1) 2024-12-18T05:45:51.3821610Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3822642Z else osd_method(model1, optim1) 2024-12-18T05:45:51.3823794Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3824942Z warnings.warn( 2024-12-18T05:45:51.3826048Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. 
If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3827188Z warnings.warn( 2024-12-18T05:45:51.3828170Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3829226Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.3830284Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3831459Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.3832685Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3833793Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3834888Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3835985Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3837083Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3838193Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3839288Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3840386Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3841413Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3842453Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3843258Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
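The `optim_input` FutureWarning is the most mechanical deprecation in this run: per the message, the argument can simply be dropped with no behavior change. A sketch using the call shapes echoed from the test file (model1/optim1/optim_input1 are the test's own names):

    # Before: the third positional argument only feeds
    # FullyShardedDataParallel._warn_optim_input(optim_input):
    #   osd = FSDP.full_optim_state_dict(model1, optim1, optim_input1)
    #
    # After: identical behavior, no FutureWarning for the argument:
    #   osd = FSDP.full_optim_state_dict(model1, optim1)
    #
    # The replacement API, FSDP.optim_state_dict(), never took optim_input,
    # so migrating (see the sketch further up) removes it as a side effect.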
2024-12-18T05:45:51.3844062Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3845100Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3846148Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3846945Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3847743Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3848783Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3849865Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3850936Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3852120Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3853053Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3853879Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3854770Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3855597Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3856733Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3857845Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3858947Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3860042Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3860921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3861855Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3862786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3863712Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3864857Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3865955Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3867044Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3868192Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3869809Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.3871395Z warnings.warn( 2024-12-18T05:45:51.3873065Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.3883473Z warnings.warn( 2024-12-18T05:45:51.3883656Z dist init r=0, world=2 2024-12-18T05:45:51.3883844Z dist init r=1, world=2 2024-12-18T05:45:51.3884031Z PASSED [12.2241s] [ 62%] 2024-12-18T05:45:51.3885522Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
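The FSDP.state_dict_type()/set_state_dict_type() warning above names its own replacement: the helpers in torch.distributed.checkpoint.state_dict (the API doc and tutorial links are in the warning text). A minimal sketch of that replacement, using the same `fsdp_model`/`optim` placeholders as earlier:

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    # One call returns both state dicts; per the warning this works across
    # FSDP1, FSDP2, and DDP without a state_dict_type context manager.
    model_sd, optim_sd = get_state_dict(fsdp_model, optim)

    # Restore both in one call later:
    set_state_dict(
        fsdp_model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )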
2024-12-18T05:45:51.3887039Z else osd_method(model1, optim1) 2024-12-18T05:45:51.3888062Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3889092Z else osd_method(model1, optim1) 2024-12-18T05:45:51.3890239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3891386Z warnings.warn( 2024-12-18T05:45:51.3892488Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3893622Z warnings.warn( 2024-12-18T05:45:51.3894651Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3895705Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.3896761Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3897812Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.3898931Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3900034Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3901139Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3902398Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3903607Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.3904711Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3905811Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3906911Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3907944Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3908986Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3909777Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3910567Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3911602Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3912639Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3913430Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3914219Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3915249Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3916325Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3917160Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3917994Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3919067Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3920140Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3920967Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
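scatter_full_optim_state_dict() exists for the case this test exercises: only rank 0 holds the full optimizer state dict and the shards must be distributed out. Under the newer checkpoint API, the closest equivalent appears to be set_optimizer_state_dict() with broadcast_from_rank0; this is a sketch under that assumption, and the checkpoint filename is made up:

    import torch
    import torch.distributed as dist
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        set_optimizer_state_dict,
    )

    # Only rank 0 needs the full, unflattened state dict; other ranks pass an
    # empty dict and receive their shards via broadcast.
    full_osd = torch.load("optim.pt") if dist.get_rank() == 0 else {}
    set_optimizer_state_dict(
        fsdp_model,
        optim,
        optim_state_dict=full_osd,
        options=StateDictOptions(full_state_dict=True, broadcast_from_rank0=True),
    )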
2024-12-18T05:45:51.3921903Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3923140Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3924246Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3925131Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3926063Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3927213Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3928322Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3929197Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3930123Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3931275Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3932375Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3933461Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3934603Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3936196Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.3937776Z warnings.warn( 2024-12-18T05:45:51.3939316Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.3940889Z warnings.warn( 2024-12-18T05:45:51.3941066Z dist init r=1, world=2 2024-12-18T05:45:51.3941255Z dist init r=0, world=2 2024-12-18T05:45:51.3941441Z PASSED [11.7243s] [ 65%] 2024-12-18T05:45:51.3943052Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3944677Z else osd_method(model1, optim1) 2024-12-18T05:45:51.3945692Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3946726Z else osd_method(model1, optim1) 2024-12-18T05:45:51.3947875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3949026Z warnings.warn( 2024-12-18T05:45:51.3950119Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.3951255Z warnings.warn( 2024-12-18T05:45:51.3952237Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3953297Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.3954356Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3955406Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.3956518Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.3957628Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3958736Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3959842Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3960942Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3962043Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3963136Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3964354Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3965497Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3966540Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3967333Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3968128Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3969165Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3970205Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3970995Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3971785Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.3972816Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3973903Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3974783Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3975608Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3976683Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3977758Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3978579Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3979402Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.3980538Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3981636Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3982515Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3983586Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3984849Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3985949Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3986825Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.3987750Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.3988898Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.3990000Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3991082Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.3992177Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.3993792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.3995372Z warnings.warn( 2024-12-18T05:45:51.3996914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.3998486Z warnings.warn( 2024-12-18T05:45:51.3998664Z dist init r=0, world=2 2024-12-18T05:45:51.3998852Z dist init r=1, world=2 2024-12-18T05:45:51.3999040Z PASSED [11.6230s] [ 68%] 2024-12-18T05:45:51.4000528Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4002042Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4003067Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4004295Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4005539Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4006690Z warnings.warn( 2024-12-18T05:45:51.4007791Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
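The recurring `_get_pg_default_device` UserWarning is an internal rename rather than a public API change; both suggested helpers keep their leading underscore, so it is mostly actionable for code that was already reaching into torch.distributed internals. A hedged sketch of the mapping the warning describes (private helpers, signatures as observed in this torch build, subject to change):

    from torch.distributed.distributed_c10d import (
        _device_capability,       # device types a group supports
        _get_object_coll_device,  # device used for object collectives
    )

    # group=None means the default process group, assumed initialized.
    coll_device = _get_object_coll_device(group=None)
    supported = _device_capability(group=None)
    print(coll_device, supported)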
2024-12-18T05:45:51.4008945Z warnings.warn( 2024-12-18T05:45:51.4009932Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4010986Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4012040Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4013097Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4014210Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4015362Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4016473Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4017577Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4018674Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4019774Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4020869Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4021970Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4023006Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4024041Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4024965Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2024-12-18T05:45:51.4025757Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4026907Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4027946Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4028731Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4029523Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4030559Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4031635Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4032466Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4033300Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4034374Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4035447Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4036283Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4037106Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4038244Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4039346Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4040441Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4041533Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4042415Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4043349Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4044280Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4045326Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4046586Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4047686Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4048776Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4049875Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4051482Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4053066Z warnings.warn( 2024-12-18T05:45:51.4054641Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4056214Z warnings.warn( 2024-12-18T05:45:51.4056394Z dist init r=1, world=2 2024-12-18T05:45:51.4056589Z dist init r=0, world=2 2024-12-18T05:45:51.4056774Z PASSED [12.1258s] [ 71%] 2024-12-18T05:45:51.4058258Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4059772Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4060924Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4062073Z warnings.warn( 2024-12-18T05:45:51.4063049Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4064075Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4065213Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4066495Z warnings.warn( 2024-12-18T05:45:51.4067591Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4068644Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4069697Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4070754Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4071879Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4072989Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4074093Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4075194Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4076300Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4077397Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4078491Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4079592Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4080627Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4081677Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4082474Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4083263Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4084301Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4085333Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4086232Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4087020Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4088150Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4089228Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4090061Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4090892Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4091980Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4093048Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4093874Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 
2024-12-18T05:45:51.4094735Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4095869Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4096976Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4098070Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4099165Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4100044Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4100984Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4101914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4102836Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4103983Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4105076Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4106162Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4107383Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4109096Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4110676Z warnings.warn( 2024-12-18T05:45:51.4112220Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4113787Z warnings.warn( 2024-12-18T05:45:51.4113963Z dist init r=1, world=2 2024-12-18T05:45:51.4114153Z dist init r=0, world=2 2024-12-18T05:45:51.4114339Z PASSED [11.7242s] [ 74%] 2024-12-18T05:45:51.4115818Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4117324Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4118350Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4119379Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4120532Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4121686Z warnings.warn( 2024-12-18T05:45:51.4122792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4123931Z warnings.warn( 2024-12-18T05:45:51.4124911Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4125966Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4127169Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4128315Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4129435Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 
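The pair of FutureWarnings above name the replacement for the load side of FSDP optimizer checkpointing. A minimal migration sketch under stated assumptions, not code from the test suite: `model` is the FSDP-wrapped module, `optim` its optimizer, and `full_osd` a consolidated optimizer state dict produced on the save side (all three names are illustrative).

    # Hedged migration sketch; names are placeholders, not from the log above.
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Deprecated (may be removed after PyTorch 2.2):
    #   FSDP.scatter_full_optim_state_dict(full_osd, model, optim=optim)
    #   FSDP.shard_full_optim_state_dict(full_osd, model, optim=optim)
    # Replacement named by the warning:
    sharded_osd = FSDP.optim_state_dict_to_load(model, optim, full_osd)
    optim.load_state_dict(sharded_osd)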
2024-12-18T05:45:51.4130609Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4131710Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4132818Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4133916Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4135053Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4136153Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4137260Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4138294Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4139343Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4140133Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4140927Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4141958Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4143004Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4143790Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4144576Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4145601Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4146672Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4147633Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. 
You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4148575Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4149657Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4150724Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4151549Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4152374Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4153517Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4154619Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4155709Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4156804Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4157682Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4158613Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4159544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4160466Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4161612Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4162715Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4163804Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4164896Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4166495Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4168185Z warnings.warn( 2024-12-18T05:45:51.4169821Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4171396Z warnings.warn( 2024-12-18T05:45:51.4171574Z dist init r=0, world=2 2024-12-18T05:45:51.4171761Z dist init r=1, world=2 2024-12-18T05:45:51.4171943Z PASSED [11.7245s] [ 77%] 2024-12-18T05:45:51.4173113Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_transformer /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2024-12-18T05:45:51.4174306Z warnings.warn( 2024-12-18T05:45:51.4175184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2024-12-18T05:45:51.4176048Z warnings.warn( 2024-12-18T05:45:51.4177036Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4178073Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4179094Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4180129Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4181273Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2024-12-18T05:45:51.4182418Z warnings.warn( 2024-12-18T05:45:51.4183528Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4184666Z warnings.warn( 2024-12-18T05:45:51.4185311Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2024-12-18T05:45:51.4185995Z warnings.warn( 2024-12-18T05:45:51.4186974Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4188160Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4189412Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1068: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4190520Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4191619Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1083: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4192718Z else FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4193754Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4194793Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4195822Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4196857Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4197893Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
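The FutureWarnings repeated above describe the save-side half of the same migration. A minimal sketch under the same assumptions (`model` FSDP-wrapped, `optim` its optimizer); per the companion warning, the deprecated `optim_input` argument can simply be dropped.

    # Hedged migration sketch; names are placeholders, not from the log above.
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Deprecated (may be removed after PyTorch 2.2), with its also-deprecated
    # optim_input argument:
    #   full_osd = FSDP.full_optim_state_dict(model, optim, optim_input)
    # Replacement named by the warning:
    full_osd = FSDP.optim_state_dict(model, optim)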
2024-12-18T05:45:51.4198978Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4200120Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1061: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4201219Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4202306Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1076: FutureWarning: ``FullyShardedDataParallel.scatter_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.scatter_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4203408Z FSDP.scatter_full_optim_state_dict( 2024-12-18T05:45:51.4205018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4206593Z warnings.warn( 2024-12-18T05:45:51.4208124Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4209818Z warnings.warn( 2024-12-18T05:45:51.4209998Z dist init r=1, world=2 2024-12-18T05:45:51.4210186Z dist init r=0, world=2 2024-12-18T05:45:51.4210472Z PASSED [13.2252s] [ 80%] 2024-12-18T05:45:51.4211961Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4213465Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4214482Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4215562Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4216711Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4217858Z warnings.warn( 2024-12-18T05:45:51.4218965Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4220105Z warnings.warn( 2024-12-18T05:45:51.4221088Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4222143Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4223200Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4224257Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4225360Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1049: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4226514Z else FSDP.shard_full_optim_state_dict(fsdp_osd1, model2, optim=optim2) 2024-12-18T05:45:51.4227675Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1049: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4228829Z else FSDP.shard_full_optim_state_dict(fsdp_osd1, model2, optim=optim2) 2024-12-18T05:45:51.4230233Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1057: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4231381Z else FSDP.shard_full_optim_state_dict(fsdp_osd2, model2, optim=optim2) 2024-12-18T05:45:51.4232529Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1057: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4233669Z else FSDP.shard_full_optim_state_dict(fsdp_osd2, model2, optim=optim2) 2024-12-18T05:45:51.4234767Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4235810Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4236608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4237405Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4238439Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4239478Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4240264Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4241059Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4242087Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4243159Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4243986Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4244814Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4245891Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4246959Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4247791Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4248618Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4249843Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1045: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4251021Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4251899Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4252828Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4253959Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1045: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4255078Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4255949Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4256880Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4258016Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1053: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4259092Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4260166Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1053: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4261244Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4262840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4264415Z warnings.warn( 2024-12-18T05:45:51.4265962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
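The FSDP.state_dict_type()/set_state_dict_type() deprecation above points at the torch.distributed.checkpoint.state_dict APIs (see the doc and tutorial links in the warning). A minimal sketch of a save/load round trip with them, again assuming an already-constructed `model` and `optim`:

    # Hedged sketch of the replacement APIs the warning links to.
    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    # Per the warning text, these work across FSDP1, FSDP2 and DDP.
    model_sd, optim_sd = get_state_dict(model, optim)
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)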
2024-12-18T05:45:51.4267537Z warnings.warn( 2024-12-18T05:45:51.4267724Z dist init r=1, world=2 2024-12-18T05:45:51.4267916Z dist init r=0, world=2 2024-12-18T05:45:51.4268103Z PASSED [11.5251s] [ 82%] 2024-12-18T05:45:51.4269585Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4271226Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4272369Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4273401Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4274552Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4275700Z warnings.warn( 2024-12-18T05:45:51.4276800Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4277944Z warnings.warn( 2024-12-18T05:45:51.4278928Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4279984Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4281042Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4282085Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4283184Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1049: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4284333Z else FSDP.shard_full_optim_state_dict(fsdp_osd1, model2, optim=optim2) 2024-12-18T05:45:51.4285492Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1049: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4286652Z else FSDP.shard_full_optim_state_dict(fsdp_osd1, model2, optim=optim2) 2024-12-18T05:45:51.4287802Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1057: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4288949Z else FSDP.shard_full_optim_state_dict(fsdp_osd2, model2, optim=optim2) 2024-12-18T05:45:51.4290099Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1057: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4291351Z else FSDP.shard_full_optim_state_dict(fsdp_osd2, model2, optim=optim2) 2024-12-18T05:45:51.4292556Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4293595Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4294390Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4295226Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4296271Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4297310Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4298097Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4298882Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4299913Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4300989Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4301825Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4302662Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4303733Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4304804Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4305629Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4306461Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4307590Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1045: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4308671Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4309553Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4310616Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4311867Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1045: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4312960Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4313828Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1338: FutureWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2024-12-18T05:45:51.4314752Z FullyShardedDataParallel._warn_optim_input(optim_input) 2024-12-18T05:45:51.4315888Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1053: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4316967Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4318035Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1053: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4319116Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4320706Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4322279Z warnings.warn( 2024-12-18T05:45:51.4323827Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4325402Z warnings.warn( 2024-12-18T05:45:51.4325578Z dist init r=1, world=2 2024-12-18T05:45:51.4325773Z dist init r=0, world=2 2024-12-18T05:45:51.4325956Z PASSED [11.5259s] [ 85%] 2024-12-18T05:45:51.4327111Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_transformer /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2024-12-18T05:45:51.4328280Z warnings.warn( 2024-12-18T05:45:51.4329102Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2024-12-18T05:45:51.4329967Z warnings.warn( 2024-12-18T05:45:51.4330950Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4332168Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4333278Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1013: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
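The enable_nested_tensor UserWarning near the top of this transformer test's output is emitted by nn.TransformerEncoder when its layer was built with batch_first=False. A minimal sketch of a construction that avoids it; the dimensions are illustrative only.

    # Hedged sketch; opting into batch-first layout silences the warning and
    # lets the encoder actually use the nested-tensor fast path.
    import torch.nn as nn

    layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=6, enable_nested_tensor=True)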
2024-12-18T05:45:51.4334311Z else osd_method(model1, optim1) 2024-12-18T05:45:51.4335499Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4336649Z warnings.warn( 2024-12-18T05:45:51.4337755Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4338896Z warnings.warn( 2024-12-18T05:45:51.4339540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2024-12-18T05:45:51.4340225Z warnings.warn( 2024-12-18T05:45:51.4341207Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1037: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4342263Z else osd_method(model2, optim2, group=new_group) 2024-12-18T05:45:51.4343369Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1049: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4344524Z else FSDP.shard_full_optim_state_dict(fsdp_osd1, model2, optim=optim2) 2024-12-18T05:45:51.4345688Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1057: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4346838Z else FSDP.shard_full_optim_state_dict(fsdp_osd2, model2, optim=optim2) 2024-12-18T05:45:51.4347949Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4348986Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4350022Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1011: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 
2024-12-18T05:45:51.4351056Z osd_method(model1, optim1, optim_input1) 2024-12-18T05:45:51.4352220Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1035: FutureWarning: ``FullyShardedDataParallel.full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4353420Z osd_method(model2, optim2, optim_input2, group=new_group) 2024-12-18T05:45:51.4354553Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1045: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4355638Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4356704Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1053: FutureWarning: ``FullyShardedDataParallel.shard_full_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.shard_full_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4357783Z FSDP.shard_full_optim_state_dict( 2024-12-18T05:45:51.4359385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4360961Z warnings.warn( 2024-12-18T05:45:51.4362496Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:690: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2024-12-18T05:45:51.4364071Z warnings.warn( 2024-12-18T05:45:51.4364248Z dist init r=1, world=2 2024-12-18T05:45:51.4364440Z dist init r=0, world=2 2024-12-18T05:45:51.4364627Z PASSED [13.2277s] [ 88%] 2024-12-18T05:45:51.4366070Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_unmanaged_params_state_dict_type1_add_to_fsdp_module_True /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1191: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4367576Z fsdp_osd = FSDP.sharded_optim_state_dict(model, optim) 2024-12-18T05:45:51.4368784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4369922Z warnings.warn( 2024-12-18T05:45:51.4370934Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1191: FutureWarning: ``FullyShardedDataParallel.sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict``. ``FullyShardedDataParallel.sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4372025Z fsdp_osd = FSDP.sharded_optim_state_dict(model, optim) 2024-12-18T05:45:51.4373418Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4374611Z warnings.warn( 2024-12-18T05:45:51.4375676Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1225: FutureWarning: ``FullyShardedDataParallel.flatten_sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.flatten_sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4376863Z FSDP.flatten_sharded_optim_state_dict(fsdp_osd, model, optim=optim) 2024-12-18T05:45:51.4378054Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1225: FutureWarning: ``FullyShardedDataParallel.flatten_sharded_optim_state_dict``is being deprecated and is replaced by ``FullyShardedDataParallel.optim_state_dict_to_load``. ``FullyShardedDataParallel.flatten_sharded_optim_state_dict`` may be removed after PyTorch 2.2. 2024-12-18T05:45:51.4379249Z FSDP.flatten_sharded_optim_state_dict(fsdp_osd, model, optim=optim) 2024-12-18T05:45:51.4379949Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.4380620Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.4381280Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.4381972Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.4382679Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:52: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.4383330Z dim_0_size = sharded_tensor.size()[0] # type: ignore[index] 2024-12-18T05:45:51.4383987Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.4384687Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.4385377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:53: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
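The FutureWarnings above come from ShardedTensor helpers in torch/distributed/_state_dict_utils.py, and the message recommends DTensor instead. A minimal sketch of creating a dim-0-sharded DTensor, assuming the default process group is already initialized and `world_size` (a placeholder, e.g. 2) ranks each own a GPU:

    # Hedged sketch of the DTensor replacement the warning recommends.
    import torch
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    mesh = init_device_mesh("cuda", (world_size,))  # world_size: assumed
    dtensor = distribute_tensor(torch.randn(8, 8), mesh, placements=[Shard(0)])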
2024-12-18T05:45:51.4386064Z tensor_numel = sharded_tensor.size().numel() # type: ignore[union-attr] 2024-12-18T05:45:51.4386759Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_state_dict_utils.py:77: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2024-12-18T05:45:51.4387446Z tensor = tensor.narrow(0, 0, tensor_numel).reshape(sharded_tensor.size()) 2024-12-18T05:45:51.4387752Z dist init r=0, world=2 2024-12-18T05:45:51.4387941Z dist init r=1, world=2 2024-12-18T05:45:51.4388132Z PASSED [11.2237s] [ 91%] 2024-12-18T05:45:51.4389554Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_state_dict_with_none_tensor_state /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4390983Z warnings.warn( 2024-12-18T05:45:51.4392087Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4393363Z warnings.warn( 2024-12-18T05:45:51.4393653Z dist init r=1, world=2 2024-12-18T05:45:51.4393840Z dist init r=0, world=2 2024-12-18T05:45:51.4394023Z PASSED [11.2222s] [ 94%] 2024-12-18T05:45:51.4395407Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_with_empty_optimizer_state /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4396828Z warnings.warn( 2024-12-18T05:45:51.4397928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4399067Z warnings.warn( 2024-12-18T05:45:51.4399240Z dist init r=0, world=2 2024-12-18T05:45:51.4399429Z dist init r=1, world=2 2024-12-18T05:45:51.4399612Z PASSED [6.4140s] [ 97%] 2024-12-18T05:45:51.4400508Z distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_with_no_shard /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1953: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T05:45:51.4401432Z model = FSDP( 2024-12-18T05:45:51.4402083Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py:1953: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
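The last warning above deprecates FSDP's NO_SHARD strategy in favor of DistributedDataParallel. A minimal sketch of that swap, assuming `module` is the plain nn.Module and `rank` is this process's local GPU index (both placeholders):

    # Hedged sketch; DDP replicates parameters, which is what FSDP(NO_SHARD) did.
    from torch.nn.parallel import DistributedDataParallel as DDP

    model = DDP(module.cuda(rank), device_ids=[rank])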
2024-12-18T05:45:51.4402774Z model = FSDP( 2024-12-18T05:45:51.4403786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4403857Z warnings.warn( 2024-12-18T05:45:51.4404856Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:863: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2024-12-18T05:45:51.4404936Z warnings.warn( 2024-12-18T05:45:51.4405010Z dist init r=0, world=2 2024-12-18T05:45:51.4405085Z dist init r=1, world=2 2024-12-18T05:45:51.4405160Z PASSED [11.2235s] [100%] 2024-12-18T05:45:51.4405164Z 2024-12-18T05:45:51.4405270Z ==================================== RERUNS ==================================== 2024-12-18T05:45:51.4405637Z _ TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False _ 2024-12-18T05:45:51.4405727Z Traceback (most recent call last): 2024-12-18T05:45:51.4406057Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 597, in wrapper 2024-12-18T05:45:51.4406142Z self._join_processes(fn) 2024-12-18T05:45:51.4406493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 837, in _join_processes 2024-12-18T05:45:51.4406709Z self._check_return_codes(elapsed_time) 2024-12-18T05:45:51.4407088Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 886, in _check_return_codes 2024-12-18T05:45:51.4407272Z raise RuntimeError(error) 2024-12-18T05:45:51.4407436Z RuntimeError: Process 1 exited with error code 10 and exception: 2024-12-18T05:45:51.4407518Z Traceback (most recent call last): 2024-12-18T05:45:51.4407856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T05:45:51.4407935Z getattr(self, test_name)() 2024-12-18T05:45:51.4408260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T05:45:51.4408326Z fn() 2024-12-18T05:45:51.4408656Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T05:45:51.4408732Z method(*args, **kwargs) 2024-12-18T05:45:51.4409071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 557, in instantiated_test 2024-12-18T05:45:51.4409157Z test(self, **param_kwargs) 2024-12-18T05:45:51.4409481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T05:45:51.4409559Z return func(*args, **kwargs) 2024-12-18T05:45:51.4409891Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 539, in test_optim_state_dict_nested 2024-12-18T05:45:51.4409966Z 
self.run_subtests( 2024-12-18T05:45:51.4410284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T05:45:51.4410390Z return run_subtests(self, *args, **kwargs) 2024-12-18T05:45:51.4410723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T05:45:51.4410841Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T05:45:51.4411168Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 594, in _test_optim_state_dict_nested 2024-12-18T05:45:51.4411309Z assert l1 == l2, f"Losses differ on iter {i}: {l1:.5f} {l2:.5f}" 2024-12-18T05:45:51.4411447Z AssertionError: Losses differ on iter 0: -0.37255 190.85599 2024-12-18T05:45:51.4411452Z 2024-12-18T05:45:51.4411600Z To execute this test, run the following from the base repo dir: 2024-12-18T05:45:51.4412183Z PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_optim_state.py TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False 2024-12-18T05:45:51.4412190Z 2024-12-18T05:45:51.4412369Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T05:45:51.4412373Z 2024-12-18T05:45:51.4412375Z 2024-12-18T05:45:51.4412526Z ----------------------------- Captured stdout call ----------------------------- 2024-12-18T05:45:51.4412697Z Process 1 terminated with exit code 10, terminating remaining processes. 2024-12-18T05:45:51.4413066Z _ TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False _ 2024-12-18T05:45:51.4413151Z Traceback (most recent call last): 2024-12-18T05:45:51.4413483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 597, in wrapper 2024-12-18T05:45:51.4413562Z self._join_processes(fn) 2024-12-18T05:45:51.4413912Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 837, in _join_processes 2024-12-18T05:45:51.4414130Z self._check_return_codes(elapsed_time) 2024-12-18T05:45:51.4414495Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 886, in _check_return_codes 2024-12-18T05:45:51.4414615Z raise RuntimeError(error) 2024-12-18T05:45:51.4414891Z RuntimeError: Process 0 exited with error code 10 and exception: 2024-12-18T05:45:51.4414982Z Traceback (most recent call last): 2024-12-18T05:45:51.4415306Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T05:45:51.4415392Z getattr(self, test_name)() 2024-12-18T05:45:51.4415709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T05:45:51.4415781Z fn() 2024-12-18T05:45:51.4416083Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T05:45:51.4416167Z method(*args, **kwargs) 2024-12-18T05:45:51.4416499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 557, in instantiated_test 2024-12-18T05:45:51.4416580Z test(self, **param_kwargs) 2024-12-18T05:45:51.4416897Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T05:45:51.4416980Z return func(*args, **kwargs) 2024-12-18T05:45:51.4417303Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 539, in test_optim_state_dict_nested 2024-12-18T05:45:51.4417385Z self.run_subtests( 2024-12-18T05:45:51.4417699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T05:45:51.4417800Z return run_subtests(self, *args, **kwargs) 2024-12-18T05:45:51.4418140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T05:45:51.4418254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T05:45:51.4418582Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 594, in _test_optim_state_dict_nested 2024-12-18T05:45:51.4418720Z assert l1 == l2, f"Losses differ on iter {i}: {l1:.5f} {l2:.5f}" 2024-12-18T05:45:51.4418860Z AssertionError: Losses differ on iter 0: -0.37255 190.85599 2024-12-18T05:45:51.4418863Z 2024-12-18T05:45:51.4419000Z To execute this test, run the following from the base repo dir: 2024-12-18T05:45:51.4419582Z PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_optim_state.py TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False 2024-12-18T05:45:51.4419586Z 2024-12-18T05:45:51.4419757Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T05:45:51.4419770Z 2024-12-18T05:45:51.4419773Z 2024-12-18T05:45:51.4419915Z ----------------------------- Captured stdout call ----------------------------- 2024-12-18T05:45:51.4420085Z Process 1 terminated with exit code 10, terminating remaining processes. 2024-12-18T05:45:51.4420223Z ----------------------------- Captured stdout call ----------------------------- 2024-12-18T05:45:51.4420394Z Process 0 terminated with exit code 10, terminating remaining processes. 
2024-12-18T05:45:51.4420764Z _ TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False _
2024-12-18T05:45:51.4420853Z Traceback (most recent call last):
2024-12-18T05:45:51.4421175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 597, in wrapper
2024-12-18T05:45:51.4421260Z self._join_processes(fn)
2024-12-18T05:45:51.4421607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 837, in _join_processes
2024-12-18T05:45:51.4421836Z self._check_return_codes(elapsed_time)
2024-12-18T05:45:51.4422197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 886, in _check_return_codes
2024-12-18T05:45:51.4422283Z raise RuntimeError(error)
2024-12-18T05:45:51.4422532Z RuntimeError: Process 1 exited with error code 10 and exception:
2024-12-18T05:45:51.4422616Z Traceback (most recent call last):
2024-12-18T05:45:51.4422940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test
2024-12-18T05:45:51.4423018Z getattr(self, test_name)()
2024-12-18T05:45:51.4423339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper
2024-12-18T05:45:51.4423405Z fn()
2024-12-18T05:45:51.4423710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper
2024-12-18T05:45:51.4423787Z method(*args, **kwargs)
2024-12-18T05:45:51.4424123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 557, in instantiated_test
2024-12-18T05:45:51.4424200Z test(self, **param_kwargs)
2024-12-18T05:45:51.4424525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper
2024-12-18T05:45:51.4424606Z return func(*args, **kwargs)
2024-12-18T05:45:51.4424936Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 539, in test_optim_state_dict_nested
2024-12-18T05:45:51.4425015Z self.run_subtests(
2024-12-18T05:45:51.4425328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests
2024-12-18T05:45:51.4425428Z return run_subtests(self, *args, **kwargs)
2024-12-18T05:45:51.4425768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests
2024-12-18T05:45:51.4425885Z test_fn(*test_args, **test_kwargs, **subtest_kwargs)
2024-12-18T05:45:51.4426210Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_optim_state.py", line 594, in _test_optim_state_dict_nested
2024-12-18T05:45:51.4426348Z assert l1 == l2, f"Losses differ on iter {i}: {l1:.5f} {l2:.5f}"
2024-12-18T05:45:51.4426481Z AssertionError: Losses differ on iter 0: -0.37255 190.85599
2024-12-18T05:45:51.4426484Z
2024-12-18T05:45:51.4426634Z To execute this test, run the following from the base repo dir:
2024-12-18T05:45:51.4427212Z PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_optim_state.py TestFSDPOptimState.test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False
2024-12-18T05:45:51.4427222Z
2024-12-18T05:45:51.4427393Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2024-12-18T05:45:51.4427397Z
2024-12-18T05:45:51.4427399Z
2024-12-18T05:45:51.4427543Z ----------------------------- Captured stdout call -----------------------------
2024-12-18T05:45:51.4427708Z Process 1 terminated with exit code 10, terminating remaining processes.
2024-12-18T05:45:51.4428232Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_optim_state/distributed.fsdp.test_fsdp_optim_state-cf12ea386aa05c69.xml -
2024-12-18T05:45:51.4428354Z =================== 35 passed, 3 rerun in 412.32s (0:06:52) ====================
2024-12-18T05:45:51.4428358Z
2024-12-18T05:45:51.4428772Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_optim_state 1/2 (test/test-reports/distributed.fsdp.test_fsdp_optim_state_1.2_0a1ae362c4d810ea_.log)
2024-12-18T05:45:51.4428775Z
2024-12-18T05:45:51.4428975Z Running distributed/checkpoint/test_checkpoint 1/1 ... [2024-12-18 05:45:51.321161]
2024-12-18T05:45:51.4429177Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:45:51.4429919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:45:51.321711]
2024-12-18T05:46:39.2343597Z
2024-12-18T05:46:39.2349785Z distributed/checkpoint/test_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_checkpoint_1.1_ec62fbe9a8f60cea_.log
2024-12-18T05:46:39.2356045Z Running 8 items in this shard: test/distributed/checkpoint/test_checkpoint.py::TestDistributedCheckpointing::test_default_metadata, test/distributed/checkpoint/test_checkpoint.py::TestDistributedCheckpointing::test_tensor_metadata_with_missing_rank_spec, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_dummy_reader_works, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_dummy_writer_works, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_load_error_handling, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_load_error_handling_no_dist, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_save_error_handling, test/distributed/checkpoint/test_checkpoint.py::TestDistributedFailure::test_save_error_handling_no_dist
2024-12-18T05:46:39.2359228Z
2024-12-18T05:46:39.2359480Z Running distributed/test_c10d_object_collectives 1/1 ... [2024-12-18 05:46:39.234722]
2024-12-18T05:46:39.2359938Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T05:46:39.2360938Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_object_collectives.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ...
[2024-12-18 05:46:39.235037] 2024-12-18T05:47:41.4336123Z 2024-12-18T05:47:41.4337925Z distributed/test_c10d_object_collectives 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_object_collectives_1.1_a5f6d3b2114716a2_.log 2024-12-18T05:47:41.4346828Z Running 9 items in this shard: test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_all_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_broadcast_object_list, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_scatter_object_list, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_send_recv_object_list, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_all_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_broadcast_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_scatter_object 2024-12-18T05:47:41.4353368Z 2024-12-18T05:47:41.4353773Z Running distributed/test_c10d_pypg 1/1 ... [2024-12-18 05:47:41.433643] 2024-12-18T05:47:41.4354575Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:47:41.4356425Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:47:41.434193] 2024-12-18T05:47:54.4781648Z 2024-12-18T05:47:54.4783550Z distributed/test_c10d_pypg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_pypg_1.1_410d19dbb261024b_.log 2024-12-18T05:47:54.4809622Z Running 43 items in this shard: test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_True, 
test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_attr_overrides 2024-12-18T05:47:54.4834727Z 2024-12-18T05:47:54.4835190Z Running distributed/tensor/parallel/test_parallelize_api 1/1 ... 
[2024-12-18 05:47:54.478081] 2024-12-18T05:47:54.4835694Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:47:54.4836740Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/parallel/test_parallelize_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:47:54.478675] 2024-12-18T05:49:02.5770810Z 2024-12-18T05:49:02.5772572Z distributed/tensor/parallel/test_parallelize_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_parallelize_api_1.1_f87983e17ab785f6_.log 2024-12-18T05:49:02.5783323Z Running 12 items in this shard: test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_empty_plan, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_linear_col_wise_parallel, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_linear_row_wise_parallel, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_mlp_with_module_api, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_mlp_with_module_api_nested, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_multi_wildcard, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_digit, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_question, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_parallelize_module_with_star, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_prepare_module_input, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_prepare_module_output, test/distributed/tensor/parallel/test_parallelize_api.py::TensorParallelAPITests::test_under_devicemesh_context 2024-12-18T05:49:02.5787767Z 2024-12-18T05:49:02.5787967Z Running distributed/fsdp/test_fsdp_traversal 1/1 ... [2024-12-18 05:49:02.577369] 2024-12-18T05:49:02.5788327Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:49:02.5789119Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_traversal.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:49:02.577924] 2024-12-18T05:49:14.5184671Z 2024-12-18T05:49:14.5186666Z distributed/fsdp/test_fsdp_traversal 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_2eb60070354a8b16_.log 2024-12-18T05:49:14.5188940Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2024-12-18T05:49:14.5189837Z 2024-12-18T05:49:14.5192315Z Running distributed/checkpoint/test_state_dict 1/1 ... 
[2024-12-18 05:49:14.518684] 2024-12-18T05:49:14.5192860Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:49:14.5197492Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:49:14.519313] 2024-12-18T05:52:37.0576607Z 2024-12-18T05:52:37.0582371Z distributed/checkpoint/test_state_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_1.1_204271d0ba995f2f_.log 2024-12-18T05:52:37.0593449Z Running 23 items in this shard: test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_activation_ckpt_fqns_ddp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_activation_ckpt_fqns_fsdp1, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_broadcast_from_rank0, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_broadcast_from_rank0_hsdp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_compiled_fsdp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_cpu_offload_full_state_dict, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_ddp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_deprecate_fsdp_api, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_deprecate_partial, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_extra_state, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_flattened_osd, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp2, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp_ddp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp_root_not_initialized, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_non_persistent_buffers, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_optim_state_dict_param_matching, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_setting_meta_device_model, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_setting_meta_device_model_broadcasting, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_shared_weight, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_single_gpu, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_strict, test/distributed/checkpoint/test_state_dict.py::TestNoComm::test_no_dist 2024-12-18T05:52:37.0600903Z 2024-12-18T05:52:37.0604700Z Running distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 1/1 ... [2024-12-18 05:52:37.060154] 2024-12-18T05:52:37.0605236Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:52:37.0610374Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:52:37.060691] 2024-12-18T05:53:25.0289823Z 2024-12-18T05:53:25.0291778Z distributed/algorithms/ddp_comm_hooks/test_ddp_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.algorithms.ddp_comm_hooks.test_ddp_hooks_1.1_47699e7bd56016a3_.log 2024-12-18T05:53:25.0302319Z Running 6 items in this shard: test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py::DistributedDataParallelCommHookTest::test_ddp_comm_hook_allreduce_hook, test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py::DistributedDataParallelCommHookTest::test_ddp_comm_hook_fp16compress_hook, test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py::DistributedDataParallelCommHookTest::test_ddp_comm_hook_noop_hook, test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py::DistributedDataParallelCommHookTest::test_ddp_comm_hook_quantize_per_channel_hook, test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py::DistributedDataParallelCommHookTest::test_ddp_comm_hook_quantize_per_tensor_hook, test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py::DistributedDataParallelCommHookTest::test_is_last_hook 2024-12-18T05:53:25.0305463Z 2024-12-18T05:53:25.0305736Z Running distributed/fsdp/test_fsdp_exec_order 1/1 ... [2024-12-18 05:53:25.029258] 2024-12-18T05:53:25.0306713Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:53:25.0307915Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:53:25.029799] 2024-12-18T05:55:04.7577370Z 2024-12-18T05:55:04.7582642Z distributed/fsdp/test_fsdp_exec_order 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_39a490c6aa708eb3_.log 2024-12-18T05:55:04.7591819Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2024-12-18T05:55:04.7595942Z 2024-12-18T05:55:04.7596248Z Running distributed/_composable/fsdp/test_fully_shard_memory 1/1 ... 
[2024-12-18 05:55:04.757940] 2024-12-18T05:55:04.7596753Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:55:04.7597814Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_memory.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:55:04.758564] 2024-12-18T05:55:31.2255162Z 2024-12-18T05:55:31.2257142Z distributed/_composable/fsdp/test_fully_shard_memory 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_memory_1.1_7e2ca7a1fcb2e02c_.log 2024-12-18T05:55:31.2260493Z Running 2 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_memory.py::TestFullyShardMemory::test_fully_shard_del_memory, test/distributed/_composable/fsdp/test_fully_shard_memory.py::TestFullyShardMemory::test_fully_shard_training_memory 2024-12-18T05:55:31.2262449Z 2024-12-18T05:55:31.2262972Z Running distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2024-12-18 05:55:31.225752] 2024-12-18T05:55:31.2263882Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:55:31.2270149Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:55:31.226298] 2024-12-18T05:55:42.0649757Z 2024-12-18T05:55:42.0651585Z distributed/fsdp/test_checkpoint_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_checkpoint_wrapper_1.1_2703add83b6081e9_.log 2024-12-18T05:55:42.0660980Z Running 8 items in this shard: test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_apply_activation_checkpointing, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_args_kwargs, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_cpu_offload, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_kwarg_support, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_parity, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_forward_missing_attributes, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_fqn, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_load_activation_checkpointed_module 2024-12-18T05:55:42.0664545Z 2024-12-18T05:55:42.0664777Z Running distributed/fsdp/test_utils 1/1 ... [2024-12-18 05:55:42.065165] 2024-12-18T05:55:42.0665195Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:55:42.0666517Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:55:42.065715] 2024-12-18T05:55:47.6933095Z 2024-12-18T05:55:47.6935299Z distributed/fsdp/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_utils_1.1_82c42eeaac174e13_.log 2024-12-18T05:55:47.6939773Z Running 5 items in this shard: test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_apply_to_tensors_cpu_cuda_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_apply_to_tensors_device_list0_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_apply_to_tensors_device_list1_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_packed_sequence_cuda, test/distributed/fsdp/test_utils.py::TestUtilsCUDA::test_replace_by_prefix_cuda 2024-12-18T05:55:47.6943574Z 2024-12-18T05:55:47.6944226Z Running distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 ... [2024-12-18 05:55:47.693516] 2024-12-18T05:55:47.6945226Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:55:47.6946312Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_hsdp_dtensor_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:55:47.694032] 2024-12-18T05:56:39.0638688Z 2024-12-18T05:56:39.0640302Z distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_hsdp_dtensor_state_dict_1.1_1e090f0eac9ecf51_.log 2024-12-18T05:56:39.0647253Z Running 8 items in this shard: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda 2024-12-18T05:56:39.0651683Z 2024-12-18T05:56:39.0651906Z Running distributed/fsdp/test_fsdp_tp_integration 1/1 ... [2024-12-18 05:56:39.063806] 2024-12-18T05:56:39.0652287Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:56:39.0653580Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_tp_integration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 05:56:39.064121] 2024-12-18T05:57:03.0813671Z 2024-12-18T05:57:03.0823914Z distributed/fsdp/test_fsdp_tp_integration 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_tp_integration_1.1_2e672b10ed0b3c46_.log 2024-12-18T05:57:03.0827747Z Running 3 items in this shard: test/distributed/fsdp/test_fsdp_tp_integration.py::TestTPFSDPIntegration::test_fsdp_tp_extension_grad, test/distributed/fsdp/test_fsdp_tp_integration.py::TestTPFSDPIntegration::test_fsdp_tp_integration, test/distributed/fsdp/test_fsdp_tp_integration.py::TestTPFSDPIntegration::test_fsdp_tp_sync_module_state 2024-12-18T05:57:03.0830314Z 2024-12-18T05:57:03.0830882Z Running distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2024-12-18 05:57:03.081562] 2024-12-18T05:57:03.0831937Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T05:57:03.0834262Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 05:57:03.082098] 2024-12-18T06:00:24.5694098Z 2024-12-18T06:00:24.5695876Z distributed/fsdp/test_fsdp_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_ade47d7aed855f76_.log 2024-12-18T06:00:24.5712832Z Running 17 items in this shard: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True, 
test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2024-12-18T06:00:24.5727025Z 2024-12-18T06:00:24.5727647Z Running distributed/_composable/fsdp/test_fully_shard_training 1/1 ... [2024-12-18 06:00:24.569633] 2024-12-18T06:00:24.5728662Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:00:24.5730736Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_training.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:00:24.570163] 2024-12-18T06:10:10.4253414Z 2024-12-18T06:10:10.4259150Z distributed/_composable/fsdp/test_fully_shard_training 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_training_1.1_8953c1fd03eb0684_.log 2024-12-18T06:10:10.4362904Z Running 24 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_training.py::test_compiled_fsdp, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardForwardInputs::test_root_move_forward_input_to_device, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardRegisteredParams::test_param_registration_after_backward, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardRegisteredParams::test_param_registration_after_forward, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardCastAfterInit::test_to_float64_after_init, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_explicit_prefetching, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_multi_forward_module, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_non_root_forward_backward, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_post_optim_event, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_multi_group, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_multi_group_cpu_offload_eager, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_multi_group_unshard_async_op, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_single_group_shard_dim0, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_single_group_shard_largest_dim, 
test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCompose::test_train_parity_with_activation_checkpointing, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardShardPlacementFnMultiProcess::test_train_parity_shard_placement_fn_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardShardPlacementFnMultiThread::test_shard_placement_fn_contiguous_params_grads, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardSharedParams::test_train_parity_with_shared_params, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardGradientAccumulation::test_1f1b_microbatching, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardGradientAccumulation::test_gradient_accumulation, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardNDTraining::test_2d_mlp_with_nd_mesh, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardHSDP3DTraining::test_3d_mlp_with_nd_mesh, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardHSDPTraining::test_train_parity_hsdp, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardCustomForwardMethod::test_register_fsdp_forward_method 2024-12-18T06:10:10.4779243Z 2024-12-18T06:10:10.4785030Z Running distributed/fsdp/test_fsdp_core 3/3 ... [2024-12-18 06:10:10.477905] 2024-12-18T06:10:10.4785976Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:10:10.4789380Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '--shard-id=3', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:10:10.478243] 2024-12-18T06:18:33.6777391Z 2024-12-18T06:18:33.6779184Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 3/3 (test/test-reports/distributed.fsdp.test_fsdp_core_3.3_87a7fda532bc3014_.log) 2024-12-18T06:18:33.6781293Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4661ad44f709fc51.xml 2024-12-18T06:18:33.6782750Z ============================= test session starts ============================== 2024-12-18T06:18:33.6783905Z platform linux -- Python 3.10.15, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-12-18T06:18:33.6785069Z cachedir: .pytest_cache 2024-12-18T06:18:33.6786373Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-12-18T06:18:33.6787117Z rootdir: /var/lib/jenkins/pytorch 2024-12-18T06:18:33.6787403Z configfile: pytest.ini 2024-12-18T06:18:33.6787957Z plugins: xdist-3.3.1, hypothesis-5.35.1, cpp-2.3.0, subtests-0.13.1, rerunfailures-14.0, flakefinder-1.1.0, xdoctest-1.1.0, typeguard-4.3.0 2024-12-18T06:18:33.6788555Z collecting ... 
collected 60 items 2024-12-18T06:18:33.6788887Z stepcurrent: Cannot find last run test, not skipping 2024-12-18T06:18:33.6798521Z Running 21 items in this shard: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda 2024-12-18T06:18:33.6808048Z 2024-12-18T06:18:33.6809404Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2024-12-18T06:18:33.6810939Z warnings.warn( 2024-12-18T06:18:33.6812072Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:118: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2024-12-18T06:18:33.6813178Z {} 
2024-12-18T06:18:33.6813682Z These modules will be wrapped as separate FSDP instances with mixed precision disabled.
2024-12-18T06:18:33.6814120Z warnings.warn(
2024-12-18T06:18:33.6815587Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2024-12-18T06:18:33.6817218Z warnings.warn(
2024-12-18T06:18:33.6818400Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance)
2024-12-18T06:18:33.6819496Z warnings.warn(
2024-12-18T06:18:33.6820348Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:118: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type:
2024-12-18T06:18:33.6821301Z {}
2024-12-18T06:18:33.6821959Z These modules will be wrapped as separate FSDP instances with mixed precision disabled.
2024-12-18T06:18:33.6822792Z warnings.warn(
2024-12-18T06:18:33.6825463Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2024-12-18T06:18:33.6828379Z warnings.warn(
2024-12-18T06:18:33.6828652Z dist init r=1, world=2
2024-12-18T06:18:33.6828928Z dist init r=0, world=2
2024-12-18T06:18:33.6829204Z PASSED [11.5281s] [ 4%]
2024-12-18T06:18:33.6830720Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance)
2024-12-18T06:18:33.6832231Z warnings.warn(
2024-12-18T06:18:33.6833624Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2024-12-18T06:18:33.6835361Z warnings.warn(
2024-12-18T06:18:33.6837334Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance)
2024-12-18T06:18:33.6839462Z warnings.warn(
2024-12-18T06:18:33.6842146Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2024-12-18T06:18:33.6844914Z warnings.warn(
2024-12-18T06:18:33.6845328Z dist init r=1, world=2
2024-12-18T06:18:33.6845784Z dist init r=0, world=2
2024-12-18T06:18:33.6846238Z PASSED [11.4246s] [ 9%]
2024-12-18T06:18:33.6849123Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_True_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance)
2024-12-18T06:18:33.6852053Z warnings.warn(
2024-12-18T06:18:33.6853719Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:118: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type:
2024-12-18T06:18:33.6855728Z {}
2024-12-18T06:18:33.6856683Z These modules will be wrapped as separate FSDP instances with mixed precision disabled.
2024-12-18T06:18:33.6857535Z warnings.warn(
2024-12-18T06:18:33.6860217Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
2024-12-18T06:18:33.6862976Z warnings.warn(
2024-12-18T06:18:33.6864976Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance)
2024-12-18T06:18:33.6866432Z warnings.warn(
2024-12-18T06:18:33.6867287Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:118: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type:
2024-12-18T06:18:33.6868256Z {}
2024-12-18T06:18:33.6868742Z These modules will be wrapped as separate FSDP instances with mixed precision disabled.
2024-12-18T06:18:33.6869175Z warnings.warn(
2024-12-18T06:18:33.6870540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1.
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.6872117Z warnings.warn( 2024-12-18T06:18:33.6872335Z dist init r=0, world=2 2024-12-18T06:18:33.6872566Z dist init r=1, world=2 2024-12-18T06:18:33.6872797Z PASSED [11.5253s] [ 14%] 2024-12-18T06:18:33.6874973Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.6878815Z warnings.warn( 2024-12-18T06:18:33.6881925Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.6885097Z warnings.warn( 2024-12-18T06:18:33.6885509Z dist init r=0, world=2 2024-12-18T06:18:33.6885958Z dist init r=1, world=2 2024-12-18T06:18:33.6886404Z PASSED [61.1066s] [ 19%] 2024-12-18T06:18:33.6890384Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.6894377Z warnings.warn( 2024-12-18T06:18:33.6896096Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.6897802Z warnings.warn( 2024-12-18T06:18:33.6900481Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.6903251Z warnings.warn( 2024-12-18T06:18:33.6906053Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.6907680Z warnings.warn( 2024-12-18T06:18:33.6908493Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.6909356Z warnings.warn( 2024-12-18T06:18:33.6910724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.6912307Z warnings.warn( 2024-12-18T06:18:33.6913843Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.6915315Z warnings.warn( 2024-12-18T06:18:33.6916885Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6918460Z warnings.warn( 2024-12-18T06:18:33.6919924Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6921487Z warnings.warn( 2024-12-18T06:18:33.6922933Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6924488Z warnings.warn( 2024-12-18T06:18:33.6925937Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6927494Z warnings.warn( 2024-12-18T06:18:33.6928944Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6930507Z warnings.warn( 2024-12-18T06:18:33.6931951Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6933504Z warnings.warn( 2024-12-18T06:18:33.6935064Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6936620Z warnings.warn( 2024-12-18T06:18:33.6938061Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
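The "mixed precision and an auto_wrap_policy" warnings above are emitted when both options are passed to the same FSDP constructor. A minimal sketch of that configuration (hypothetical model and dtype choices; assumes a process group is already initialized):

import functools
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision
from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy

model = nn.Transformer()  # hypothetical stand-in for the test model
fsdp_model = FSDP(
    model,
    # Wrap any submodule above a parameter-count threshold as its own FSDP unit.
    auto_wrap_policy=functools.partial(size_based_auto_wrap_policy, min_num_params=100_000),
    # Combining this with auto wrapping is what triggers the UserWarning above.
    mixed_precision=MixedPrecision(param_dtype=torch.float16, reduce_dtype=torch.float16),
)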
2024-12-18T06:18:33.6939616Z warnings.warn( 2024-12-18T06:18:33.6940030Z dist init r=0, world=2 2024-12-18T06:18:33.6940471Z dist init r=1, world=2 2024-12-18T06:18:33.6940916Z PASSED [23.4486s] [ 23%] 2024-12-18T06:18:33.6944849Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.6947342Z warnings.warn( 2024-12-18T06:18:33.6948155Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.6949027Z warnings.warn( 2024-12-18T06:18:33.6950393Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.6952108Z warnings.warn( 2024-12-18T06:18:33.6953690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.6955379Z warnings.warn( 2024-12-18T06:18:33.6957101Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.6958800Z warnings.warn( 2024-12-18T06:18:33.6961490Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.6964252Z warnings.warn( 2024-12-18T06:18:33.6966911Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
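The recurring `device_id` warning is avoidable by binding each rank to a concrete GPU before wrapping. A minimal sketch, assuming `local_rank` is a hypothetical value derived from the launcher environment:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_on_local_gpu(module: torch.nn.Module, local_rank: int) -> FSDP:
    # Bind this process to one device, as the warning recommends.
    torch.cuda.set_device(local_rank)
    # An explicit index (rather than the bare "cuda" device) silences the warning,
    # and also moves a CPU-resident module to the GPU for sharding initialization,
    # which addresses the "passed-in `module` is on CPU" warning as well.
    return FSDP(module, device_id=local_rank)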
2024-12-18T06:18:33.6969668Z warnings.warn( 2024-12-18T06:18:33.6971123Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6972696Z warnings.warn( 2024-12-18T06:18:33.6974139Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6975776Z warnings.warn( 2024-12-18T06:18:33.6977220Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6978785Z warnings.warn( 2024-12-18T06:18:33.6980234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6981791Z warnings.warn( 2024-12-18T06:18:33.6983239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6984805Z warnings.warn( 2024-12-18T06:18:33.6986157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6986951Z warnings.warn( 2024-12-18T06:18:33.6987690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6988488Z warnings.warn( 2024-12-18T06:18:33.6998129Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.6999688Z warnings.warn( 2024-12-18T06:18:33.7000099Z dist init r=1, world=2 2024-12-18T06:18:33.7000543Z dist init r=0, world=2 2024-12-18T06:18:33.7001280Z PASSED [23.5514s] [ 28%] 2024-12-18T06:18:33.7005344Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7009407Z warnings.warn( 2024-12-18T06:18:33.7010971Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1.
2024-12-18T06:18:33.7012633Z warnings.warn( 2024-12-18T06:18:33.7015405Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7018178Z warnings.warn( 2024-12-18T06:18:33.7021251Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7024426Z warnings.warn( 2024-12-18T06:18:33.7025967Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2024-12-18T06:18:33.7026863Z warnings.warn( 2024-12-18T06:18:33.7028228Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7029639Z warnings.warn( 2024-12-18T06:18:33.7031001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7032430Z warnings.warn( 2024-12-18T06:18:33.7033200Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7034006Z warnings.warn( 2024-12-18T06:18:33.7034752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7036159Z warnings.warn( 2024-12-18T06:18:33.7037597Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7039152Z warnings.warn( 2024-12-18T06:18:33.7040865Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
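The "switching to use `NO_SHARD` ... since the world size is 1" warnings come from the mixture-of-experts tests, which appear to place expert parameters in per-rank subgroups of size one. A sketch of how such a subgroup can arise (hypothetical expert module; every rank must participate in each new_group call):

import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

# One single-rank group per rank; inside a size-1 group FSDP has nothing to
# shard, so it falls back to NO_SHARD, which is what the warning reports.
groups = [dist.new_group(ranks=[r]) for r in range(dist.get_world_size())]
expert = nn.Linear(8, 8)  # hypothetical expert module
fsdp_expert = FSDP(expert, process_group=groups[dist.get_rank()],
                   sharding_strategy=ShardingStrategy.SHARD_GRAD_OP)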
2024-12-18T06:18:33.7042420Z warnings.warn( 2024-12-18T06:18:33.7043855Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7045407Z warnings.warn( 2024-12-18T06:18:33.7046843Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7048400Z warnings.warn( 2024-12-18T06:18:33.7049840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7051382Z warnings.warn( 2024-12-18T06:18:33.7052824Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7054379Z warnings.warn( 2024-12-18T06:18:33.7054893Z dist init r=1, world=2 2024-12-18T06:18:33.7055337Z dist init r=0, world=2 2024-12-18T06:18:33.7055780Z PASSED [47.6899s] [ 33%] 2024-12-18T06:18:33.7057947Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7060043Z warnings.warn( 2024-12-18T06:18:33.7060853Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.7061714Z warnings.warn( 2024-12-18T06:18:33.7063082Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7064499Z warnings.warn( 2024-12-18T06:18:33.7066078Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7067696Z warnings.warn( 2024-12-18T06:18:33.7068512Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1.
2024-12-18T06:18:33.7069579Z warnings.warn( 2024-12-18T06:18:33.7071090Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7072504Z warnings.warn( 2024-12-18T06:18:33.7073862Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7075279Z warnings.warn( 2024-12-18T06:18:33.7076030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7076829Z warnings.warn( 2024-12-18T06:18:33.7077572Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7078375Z warnings.warn( 2024-12-18T06:18:33.7079118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7079931Z warnings.warn( 2024-12-18T06:18:33.7080676Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7081482Z warnings.warn( 2024-12-18T06:18:33.7082234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7083038Z warnings.warn( 2024-12-18T06:18:33.7083785Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7084586Z warnings.warn( 2024-12-18T06:18:33.7085385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7086936Z warnings.warn( 2024-12-18T06:18:33.7088380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
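The repeated `_state_dict_utils` warnings note that under `NO_SHARD` a full (unsharded) state dict is returned regardless of what was requested. A minimal sketch of requesting one explicitly (hypothetical `fsdp_model`; the config flags shown are optional):

from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    FullStateDictConfig,
    StateDictType,
)

cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
with FSDP.state_dict_type(fsdp_model, StateDictType.FULL_STATE_DICT, cfg):
    state = fsdp_model.state_dict()  # gathered on rank 0, parameters on CPU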
2024-12-18T06:18:33.7089946Z warnings.warn( 2024-12-18T06:18:33.7090372Z dist init r=1, world=2 2024-12-18T06:18:33.7090820Z dist init r=0, world=2 2024-12-18T06:18:33.7091259Z PASSED [45.6873s] [ 38%] 2024-12-18T06:18:33.7095443Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7099530Z warnings.warn( 2024-12-18T06:18:33.7101407Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.7103089Z warnings.warn( 2024-12-18T06:18:33.7105995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7107448Z warnings.warn( 2024-12-18T06:18:33.7109024Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7110649Z warnings.warn( 2024-12-18T06:18:33.7111466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.7112336Z warnings.warn( 2024-12-18T06:18:33.7113696Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7115114Z warnings.warn( 2024-12-18T06:18:33.7117121Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
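The `offload_true` variants of these tests exercise FSDP's CPU-offload path. A minimal sketch of that flag (hypothetical `model`; assumes the process group and device binding shown earlier):

from torch.distributed.fsdp import CPUOffload, FullyShardedDataParallel as FSDP

# Keep sharded parameters on CPU between uses, trading memory for transfer cost.
fsdp_model = FSDP(model, cpu_offload=CPUOffload(offload_params=True))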
2024-12-18T06:18:33.7119199Z warnings.warn( 2024-12-18T06:18:33.7120292Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7121471Z warnings.warn( 2024-12-18T06:18:33.7122559Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7123721Z warnings.warn( 2024-12-18T06:18:33.7124808Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7125972Z warnings.warn( 2024-12-18T06:18:33.7127066Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7128236Z warnings.warn( 2024-12-18T06:18:33.7129321Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7130486Z warnings.warn( 2024-12-18T06:18:33.7131567Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7132932Z warnings.warn( 2024-12-18T06:18:33.7134020Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7135259Z warnings.warn( 2024-12-18T06:18:33.7136579Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2024-12-18T06:18:33.7137754Z warnings.warn( 2024-12-18T06:18:33.7138468Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] Caught exception: 2024-12-18T06:18:33.7139737Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] Traceback (most recent call last): 2024-12-18T06:18:33.7141545Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T06:18:33.7143331Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] getattr(self, test_name)() 2024-12-18T06:18:33.7145116Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T06:18:33.7146534Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] fn() 2024-12-18T06:18:33.7147645Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7148831Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs) 2024-12-18T06:18:33.7149996Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7151160Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs) 2024-12-18T06:18:33.7152391Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 460, in instantiated_test 2024-12-18T06:18:33.7153671Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] result = test(self, **param_kwargs) 2024-12-18T06:18:33.7154906Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T06:18:33.7156583Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] return func(*args, **kwargs) 2024-12-18T06:18:33.7158403Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_core.py", line 255, in test_mixture_of_experts_with_delay_before_free 2024-12-18T06:18:33.7160165Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] self.run_subtests( 2024-12-18T06:18:33.7161883Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T06:18:33.7163944Z 
[rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] return run_subtests(self, *args, **kwargs) 2024-12-18T06:18:33.7165975Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T06:18:33.7167875Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T06:18:33.7169782Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1489, in _test_fsdp_parity 2024-12-18T06:18:33.7171531Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] self.assertEqual( 2024-12-18T06:18:33.7173252Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4016, in assertEqual 2024-12-18T06:18:33.7175139Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] raise error_metas.pop()[0].to_error( 2024-12-18T06:18:33.7176537Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] AssertionError: Tensor-likes are not close! 2024-12-18T06:18:33.7177742Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7178900Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] Mismatched elements: 96 / 96 (100.0%) 2024-12-18T06:18:33.7180481Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] Greatest absolute difference: 0.10681381821632385 at index (10, 0) (up to 1e-05 allowed) 2024-12-18T06:18:33.7182295Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] Greatest relative difference: 15.62096881866455 at index (0, 5) (up to 1.3e-06 allowed) 2024-12-18T06:18:33.7183677Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7184828Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] The failure occurred for item [0] 2024-12-18T06:18:33.7186077Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] FSDP did not match DDP 2024-12-18T06:18:33.7186812Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7187678Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] To execute this test, run the following from the base repo dir: 2024-12-18T06:18:33.7189167Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2024-12-18T06:18:33.7190429Z [rank0]:E1218 06:14:47.598000 208608 
site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7191337Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T06:18:33.7192380Z [rank0]:E1218 06:14:47.598000 208608 site-packages/torch/testing/_internal/common_distributed.py:733] exiting process 0 with exit code: 10 2024-12-18T06:18:33.7192978Z dist init r=0, world=2 2024-12-18T06:18:33.7193651Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] Caught exception: 2024-12-18T06:18:33.7194494Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] Traceback (most recent call last): 2024-12-18T06:18:33.7196206Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T06:18:33.7197979Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] getattr(self, test_name)() 2024-12-18T06:18:33.7199759Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T06:18:33.7201417Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] fn() 2024-12-18T06:18:33.7203038Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7204765Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs) 2024-12-18T06:18:33.7206482Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7208184Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] method(*args, **kwargs) 2024-12-18T06:18:33.7209978Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 460, in instantiated_test 2024-12-18T06:18:33.7211843Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] result = test(self, **param_kwargs) 2024-12-18T06:18:33.7213664Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T06:18:33.7215530Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] return func(*args, **kwargs) 2024-12-18T06:18:33.7217350Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_core.py", line 255, in 
test_mixture_of_experts_with_delay_before_free 2024-12-18T06:18:33.7219112Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] self.run_subtests( 2024-12-18T06:18:33.7220834Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T06:18:33.7222658Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] return run_subtests(self, *args, **kwargs) 2024-12-18T06:18:33.7224517Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T06:18:33.7226239Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T06:18:33.7227417Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1489, in _test_fsdp_parity 2024-12-18T06:18:33.7228488Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] self.assertEqual( 2024-12-18T06:18:33.7229434Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4016, in assertEqual 2024-12-18T06:18:33.7230415Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] raise error_metas.pop()[0].to_error( 2024-12-18T06:18:33.7231180Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] AssertionError: Tensor-likes are not close! 
2024-12-18T06:18:33.7231835Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7232464Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] Mismatched elements: 96 / 96 (100.0%) 2024-12-18T06:18:33.7233332Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] Greatest absolute difference: 0.10681381821632385 at index (10, 0) (up to 1e-05 allowed) 2024-12-18T06:18:33.7234322Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] Greatest relative difference: 15.62096881866455 at index (0, 5) (up to 1.3e-06 allowed) 2024-12-18T06:18:33.7235081Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7235709Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] The failure occurred for item [0] 2024-12-18T06:18:33.7236397Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] FSDP did not match DDP 2024-12-18T06:18:33.7236983Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7237682Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] To execute this test, run the following from the base repo dir: 2024-12-18T06:18:33.7238875Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2024-12-18T06:18:33.7239881Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] 2024-12-18T06:18:33.7240608Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T06:18:33.7241448Z [rank1]:E1218 06:14:47.612000 208609 site-packages/torch/testing/_internal/common_distributed.py:733] exiting process 1 with exit code: 10 2024-12-18T06:18:33.7241930Z dist init r=1, world=2 2024-12-18T06:18:33.7242744Z [rank0]:[W1218 06:14:47.524943353 ProcessGroupNCCL.cpp:1496] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2024-12-18T06:18:33.7244163Z [rank1]:[W1218 06:14:47.537568065 ProcessGroupNCCL.cpp:1496] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2024-12-18T06:18:33.7245001Z ('RERUN', {'yellow': True}) [50.4935s] [ 42%] 2024-12-18T06:18:33.7246932Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7248617Z warnings.warn( 2024-12-18T06:18:33.7249281Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.7249988Z warnings.warn( 2024-12-18T06:18:33.7251099Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7252239Z warnings.warn( 2024-12-18T06:18:33.7253509Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7254847Z warnings.warn( 2024-12-18T06:18:33.7255502Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:444: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2024-12-18T06:18:33.7256200Z warnings.warn( 2024-12-18T06:18:33.7257306Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7258440Z warnings.warn( 2024-12-18T06:18:33.7259528Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7260660Z warnings.warn( 2024-12-18T06:18:33.7261270Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7261917Z warnings.warn( 2024-12-18T06:18:33.7262515Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
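The rerun above was triggered by a tolerance check: the harness compares FSDP and DDP outputs element-wise, and the reported "up to 1e-05 allowed" / "up to 1.3e-06 allowed" figures match the default float32 tolerances of torch.testing.assert_close. A self-contained sketch of the same kind of comparison, with hypothetical tensors:

import torch

expected = torch.zeros(12, 8)
actual = expected + 0.1  # off by far more than the default tolerances
# Raises AssertionError: Tensor-likes are not close!
torch.testing.assert_close(actual, expected, rtol=1.3e-6, atol=1e-5)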
2024-12-18T06:18:33.7263161Z warnings.warn( 2024-12-18T06:18:33.7263758Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7264400Z warnings.warn( 2024-12-18T06:18:33.7265127Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7265771Z warnings.warn( 2024-12-18T06:18:33.7266499Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7267153Z warnings.warn( 2024-12-18T06:18:33.7267755Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7268397Z warnings.warn( 2024-12-18T06:18:33.7268989Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7269638Z warnings.warn( 2024-12-18T06:18:33.7270240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7270881Z warnings.warn( 2024-12-18T06:18:33.7271062Z dist init r=0, world=2 2024-12-18T06:18:33.7271265Z dist init r=1, world=2 2024-12-18T06:18:33.7271454Z PASSED [48.1868s] [ 42%] 2024-12-18T06:18:33.7273063Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_none_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7274688Z warnings.warn( 2024-12-18T06:18:33.7275956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
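The ProcessGroupNCCL warning printed after the failing run recommends explicit teardown of the process group. A minimal sketch (assumes an NCCL-capable environment with launcher-provided rank and world-size variables):

import torch.distributed as dist

dist.init_process_group("nccl")
try:
    pass  # training or test body goes here
finally:
    # Avoids the "destroy_process_group() was not called before program exit"
    # resource-leak warning seen above.
    dist.destroy_process_group()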
2024-12-18T06:18:33.7277257Z warnings.warn( 2024-12-18T06:18:33.7277435Z dist init r=1, world=2 2024-12-18T06:18:33.7277621Z dist init r=0, world=2 2024-12-18T06:18:33.7277811Z PASSED [13.0270s] [ 47%] 2024-12-18T06:18:33.7279439Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7281073Z warnings.warn( 2024-12-18T06:18:33.7282332Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7283745Z warnings.warn( 2024-12-18T06:18:33.7283921Z dist init r=1, world=2 2024-12-18T06:18:33.7284108Z dist init r=0, world=2 2024-12-18T06:18:33.7284298Z PASSED [13.6280s] [ 52%] 2024-12-18T06:18:33.7285427Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_no_shard_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:429: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7286516Z return FSDP(layer, group, **fsdp_kwargs) 2024-12-18T06:18:33.7287852Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7289162Z warnings.warn( 2024-12-18T06:18:33.7289859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1425: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7290665Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2024-12-18T06:18:33.7291468Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:429: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7292210Z return FSDP(layer, group, **fsdp_kwargs) 2024-12-18T06:18:33.7293536Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7294899Z warnings.warn( 2024-12-18T06:18:33.7295603Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1425: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7296403Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2024-12-18T06:18:33.7297131Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7297775Z warnings.warn( 2024-12-18T06:18:33.7298371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7299015Z warnings.warn( 2024-12-18T06:18:33.7299620Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7300257Z warnings.warn( 2024-12-18T06:18:33.7300856Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7301502Z warnings.warn( 2024-12-18T06:18:33.7302100Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7302741Z warnings.warn( 2024-12-18T06:18:33.7303471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7304118Z warnings.warn( 2024-12-18T06:18:33.7304830Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7305477Z warnings.warn( 2024-12-18T06:18:33.7306075Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7306717Z warnings.warn( 2024-12-18T06:18:33.7306894Z dist init r=0, world=2 2024-12-18T06:18:33.7307081Z dist init r=1, world=2 2024-12-18T06:18:33.7307270Z PASSED [12.6268s] [ 57%] 2024-12-18T06:18:33.7308866Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_none_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization.
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7310483Z warnings.warn( 2024-12-18T06:18:33.7311741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7313045Z warnings.warn( 2024-12-18T06:18:33.7313225Z dist init r=1, world=2 2024-12-18T06:18:33.7313409Z dist init r=0, world=2 2024-12-18T06:18:33.7313599Z PASSED [13.0293s] [ 61%] 2024-12-18T06:18:33.7315205Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7316825Z warnings.warn( 2024-12-18T06:18:33.7318089Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7319388Z warnings.warn( 2024-12-18T06:18:33.7319564Z dist init r=0, world=2 2024-12-18T06:18:33.7319751Z dist init r=1, world=2 2024-12-18T06:18:33.7319935Z PASSED [12.9293s] [ 66%] 2024-12-18T06:18:33.7320957Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:429: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7322131Z return FSDP(layer, group, **fsdp_kwargs) 2024-12-18T06:18:33.7323549Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7324851Z warnings.warn( 2024-12-18T06:18:33.7325538Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1425: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
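The repeated FutureWarning above points at a concrete migration: wrapping a model in FSDP with `ShardingStrategy.NO_SHARD` replicates parameters the way DDP does, so the deprecation path is to use `DistributedDataParallel` directly. A minimal sketch of that swap, assuming a torchrun-style launcher that sets LOCAL_RANK and one GPU per rank on the "nccl" backend (RCCL on ROCm); the helper name is illustrative, not part of the test harness:

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_replicated(model: torch.nn.Module) -> torch.nn.Module:
        # One process per GPU; torchrun exports LOCAL_RANK for each worker.
        if not dist.is_initialized():
            dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        # Replaces: FSDP(model, sharding_strategy=ShardingStrategy.NO_SHARD)
        return DDP(model.cuda(local_rank), device_ids=[local_rank])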
2024-12-18T06:18:33.7326338Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2024-12-18T06:18:33.7327133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:429: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7327885Z return FSDP(layer, group, **fsdp_kwargs) 2024-12-18T06:18:33.7329210Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7330509Z warnings.warn( 2024-12-18T06:18:33.7331194Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1425: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2024-12-18T06:18:33.7331998Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2024-12-18T06:18:33.7332718Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7333365Z warnings.warn( 2024-12-18T06:18:33.7333962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:773: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7334653Z warnings.warn( 2024-12-18T06:18:33.7335247Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7335893Z warnings.warn( 2024-12-18T06:18:33.7336487Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:711: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7337129Z warnings.warn( 2024-12-18T06:18:33.7337727Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7338379Z warnings.warn( 2024-12-18T06:18:33.7338978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:827: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7339616Z warnings.warn( 2024-12-18T06:18:33.7340213Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2024-12-18T06:18:33.7340988Z warnings.warn( 2024-12-18T06:18:33.7341587Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:864: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
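The `_state_dict_utils.py` UserWarnings above record the fallback that `NO_SHARD` triggers during checkpointing: even when a sharded state dict is requested, FSDP hands back a full (unsharded) one. A minimal sketch that should reproduce the warning, assuming a build where a single-process gloo group can host FSDP on CPU (the address, port, and module are illustrative):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import (
        FullyShardedDataParallel as FSDP,
        ShardingStrategy,
        StateDictType,
    )

    # A one-rank gloo group is enough to observe the fallback.
    dist.init_process_group(
        "gloo", init_method="tcp://127.0.0.1:29500", rank=0, world_size=1
    )
    model = FSDP(torch.nn.Linear(4, 4), sharding_strategy=ShardingStrategy.NO_SHARD)
    with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
        sd = model.state_dict()  # warns: full_state_dict will be returned
    dist.destroy_process_group()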
2024-12-18T06:18:33.7342231Z warnings.warn( 2024-12-18T06:18:33.7342412Z dist init r=1, world=2 2024-12-18T06:18:33.7342733Z dist init r=0, world=2 2024-12-18T06:18:33.7342924Z PASSED [13.1285s] [ 71%] 2024-12-18T06:18:33.7344524Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7346136Z warnings.warn( 2024-12-18T06:18:33.7347410Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7348719Z warnings.warn( 2024-12-18T06:18:33.7348898Z dist init r=1, world=2 2024-12-18T06:18:33.7349083Z dist init r=0, world=2 2024-12-18T06:18:33.7349271Z PASSED [13.5277s] [ 76%] 2024-12-18T06:18:33.7350969Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7352684Z warnings.warn( 2024-12-18T06:18:33.7353945Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
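The `_init_utils.py:1045` UserWarning that recurs throughout this shard describes its own fix: pass `device_id` so FSDP moves the CPU-constructed module to the rank's GPU before sharding initialization, which also satisfies the GPU requirement of `sync_module_states=True`. A minimal sketch under those assumptions (function and argument names are illustrative; the process group is assumed to be initialized already):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def shard_on_gpu(model: torch.nn.Module, rank: int) -> FSDP:
        # `model` may still live on CPU here; `device_id` makes FSDP move it
        # to this rank's GPU before sharding initialization runs, and a GPU
        # module is also required by `sync_module_states=True`, which
        # broadcasts rank 0's parameters and buffers to the other ranks.
        return FSDP(
            model,
            device_id=torch.device("cuda", rank),
            sync_module_states=True,
        )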
2024-12-18T06:18:33.7355244Z warnings.warn( 2024-12-18T06:18:33.7355424Z dist init r=1, world=2 2024-12-18T06:18:33.7355611Z dist init r=0, world=2 2024-12-18T06:18:33.7355792Z PASSED [12.6279s] [ 80%] 2024-12-18T06:18:33.7356920Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_none_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2024-12-18T06:18:33.7358063Z warnings.warn( 2024-12-18T06:18:33.7358891Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2024-12-18T06:18:33.7359755Z warnings.warn( 2024-12-18T06:18:33.7361128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7362535Z warnings.warn( 2024-12-18T06:18:33.7363793Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7365088Z warnings.warn( 2024-12-18T06:18:33.7365268Z dist init r=1, world=2 2024-12-18T06:18:33.7365451Z dist init r=0, world=2 2024-12-18T06:18:33.7365640Z PASSED [17.1333s] [ 85%] 2024-12-18T06:18:33.7366792Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2024-12-18T06:18:33.7367963Z warnings.warn( 2024-12-18T06:18:33.7368793Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2024-12-18T06:18:33.7369657Z warnings.warn( 2024-12-18T06:18:33.7370925Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization.
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7372230Z warnings.warn( 2024-12-18T06:18:33.7373504Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1045: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2024-12-18T06:18:33.7374867Z warnings.warn( 2024-12-18T06:18:33.7375051Z dist init r=1, world=2 2024-12-18T06:18:33.7375236Z dist init r=0, world=2 2024-12-18T06:18:33.7375424Z PASSED [17.1345s] [ 90%] 2024-12-18T06:18:33.7376558Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_False_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2024-12-18T06:18:33.7377717Z warnings.warn( 2024-12-18T06:18:33.7378822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7380089Z warnings.warn( 2024-12-18T06:18:33.7381036Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:375: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True (use batch_first for better inference performance) 2024-12-18T06:18:33.7381906Z warnings.warn( 2024-12-18T06:18:33.7383003Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7384145Z warnings.warn( 2024-12-18T06:18:33.7384323Z dist init r=1, world=2 2024-12-18T06:18:33.7384511Z dist init r=0, world=2 2024-12-18T06:18:33.7384712Z PASSED [12.1243s] [ 95%] 2024-12-18T06:18:33.7386080Z distributed/fsdp/test_fsdp_core.py::TestAutogradCUDA::test_unshard_params_as_tensors_cuda /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument.
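The `_init_utils.py:845` UserWarning above is about an indexless `device_id`: the tests passed the bare device "cuda", so FSDP fell back to whatever the current device happened to be on each rank. The warning names two remedies; a minimal sketch of both, with illustrative names and one visible GPU per rank assumed:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def init_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
        # Remedy 1: pin the current device so a bare "cuda" resolves to it.
        torch.cuda.set_device(rank)
        # Remedy 2: pass an explicitly indexed device and avoid the ambiguity.
        return FSDP(model, device_id=torch.device("cuda", rank))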
2024-12-18T06:18:33.7387465Z warnings.warn( 2024-12-18T06:18:33.7388569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:845: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2024-12-18T06:18:33.7389706Z warnings.warn( 2024-12-18T06:18:33.7389885Z dist init r=1, world=2 2024-12-18T06:18:33.7390073Z dist init r=0, world=2 2024-12-18T06:18:33.7390259Z PASSED [12.0255s] [100%] 2024-12-18T06:18:33.7390377Z 2024-12-18T06:18:33.7390484Z ==================================== RERUNS ==================================== 2024-12-18T06:18:33.7390938Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda _ 2024-12-18T06:18:33.7391365Z Traceback (most recent call last): 2024-12-18T06:18:33.7391844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 597, in wrapper 2024-12-18T06:18:33.7392323Z self._join_processes(fn) 2024-12-18T06:18:33.7392803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 837, in _join_processes 2024-12-18T06:18:33.7393319Z self._check_return_codes(elapsed_time) 2024-12-18T06:18:33.7393846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 886, in _check_return_codes 2024-12-18T06:18:33.7394363Z raise RuntimeError(error) 2024-12-18T06:18:33.7394651Z RuntimeError: Process 0 exited with error code 10 and exception: 2024-12-18T06:18:33.7394962Z Traceback (most recent call last): 2024-12-18T06:18:33.7395439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T06:18:33.7395923Z getattr(self, test_name)() 2024-12-18T06:18:33.7396380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T06:18:33.7396845Z fn() 2024-12-18T06:18:33.7397241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7397704Z method(*args, **kwargs) 2024-12-18T06:18:33.7398250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7398704Z method(*args, **kwargs) 2024-12-18T06:18:33.7399183Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 460, in instantiated_test 2024-12-18T06:18:33.7399799Z result = test(self, **param_kwargs) 2024-12-18T06:18:33.7400286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T06:18:33.7400762Z return func(*args, **kwargs) 2024-12-18T06:18:33.7401250Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_core.py", line 255, in test_mixture_of_experts_with_delay_before_free 2024-12-18T06:18:33.7401743Z self.run_subtests( 2024-12-18T06:18:33.7402185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T06:18:33.7402680Z return run_subtests(self, *args, **kwargs) 
2024-12-18T06:18:33.7403182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T06:18:33.7403712Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T06:18:33.7404247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1489, in _test_fsdp_parity 2024-12-18T06:18:33.7404730Z self.assertEqual( 2024-12-18T06:18:33.7405167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4016, in assertEqual 2024-12-18T06:18:33.7405652Z raise error_metas.pop()[0].to_error( 2024-12-18T06:18:33.7405916Z AssertionError: Tensor-likes are not close! 2024-12-18T06:18:33.7406086Z 2024-12-18T06:18:33.7406179Z Mismatched elements: 96 / 96 (100.0%) 2024-12-18T06:18:33.7406533Z Greatest absolute difference: 0.10681381821632385 at index (10, 0) (up to 1e-05 allowed) 2024-12-18T06:18:33.7407018Z Greatest relative difference: 15.62096881866455 at index (0, 5) (up to 1.3e-06 allowed) 2024-12-18T06:18:33.7407292Z 2024-12-18T06:18:33.7407385Z The failure occurred for item [0] 2024-12-18T06:18:33.7407611Z FSDP did not match DDP 2024-12-18T06:18:33.7407729Z 2024-12-18T06:18:33.7407881Z To execute this test, run the following from the base repo dir: 2024-12-18T06:18:33.7408553Z PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2024-12-18T06:18:33.7409085Z 2024-12-18T06:18:33.7409255Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T06:18:33.7409508Z 2024-12-18T06:18:33.7409619Z Process 1 exited with error code 10 and exception: 2024-12-18T06:18:33.7409894Z Traceback (most recent call last): 2024-12-18T06:18:33.7410378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 726, in run_test 2024-12-18T06:18:33.7410866Z getattr(self, test_name)() 2024-12-18T06:18:33.7411325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 599, in wrapper 2024-12-18T06:18:33.7411785Z fn() 2024-12-18T06:18:33.7412183Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7412641Z method(*args, **kwargs) 2024-12-18T06:18:33.7413071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3108, in wrapper 2024-12-18T06:18:33.7413524Z method(*args, **kwargs) 2024-12-18T06:18:33.7414006Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 460, in instantiated_test 2024-12-18T06:18:33.7414562Z result = test(self, **param_kwargs) 2024-12-18T06:18:33.7415041Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 199, in wrapper 2024-12-18T06:18:33.7415669Z return func(*args, **kwargs) 2024-12-18T06:18:33.7416154Z File "/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_core.py", line 255, in test_mixture_of_experts_with_delay_before_free 2024-12-18T06:18:33.7416762Z self.run_subtests( 2024-12-18T06:18:33.7417202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1186, in run_subtests 2024-12-18T06:18:33.7417690Z return run_subtests(self, *args, **kwargs) 
2024-12-18T06:18:33.7418200Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 993, in run_subtests 2024-12-18T06:18:33.7418724Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2024-12-18T06:18:33.7419245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1489, in _test_fsdp_parity 2024-12-18T06:18:33.7419734Z self.assertEqual( 2024-12-18T06:18:33.7420166Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4016, in assertEqual 2024-12-18T06:18:33.7420642Z raise error_metas.pop()[0].to_error( 2024-12-18T06:18:33.7420907Z AssertionError: Tensor-likes are not close! 2024-12-18T06:18:33.7421086Z 2024-12-18T06:18:33.7421170Z Mismatched elements: 96 / 96 (100.0%) 2024-12-18T06:18:33.7421526Z Greatest absolute difference: 0.10681381821632385 at index (10, 0) (up to 1e-05 allowed) 2024-12-18T06:18:33.7422004Z Greatest relative difference: 15.62096881866455 at index (0, 5) (up to 1.3e-06 allowed) 2024-12-18T06:18:33.7422283Z 2024-12-18T06:18:33.7422371Z The failure occurred for item [0] 2024-12-18T06:18:33.7422596Z FSDP did not match DDP 2024-12-18T06:18:33.7422718Z 2024-12-18T06:18:33.7422855Z To execute this test, run the following from the base repo dir: 2024-12-18T06:18:33.7423520Z PYTORCH_TEST_WITH_ROCM=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op_cuda 2024-12-18T06:18:33.7424055Z 2024-12-18T06:18:33.7424220Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2024-12-18T06:18:33.7424473Z 2024-12-18T06:18:33.7424476Z 2024-12-18T06:18:33.7424634Z ----------------------------- Captured stdout call ----------------------------- 2024-12-18T06:18:33.7425024Z Process 0 terminated with exit code 10, terminating remaining processes. 2024-12-18T06:18:33.7425742Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4661ad44f709fc51.xml - 2024-12-18T06:18:33.7426411Z =================== 21 passed, 1 rerun in 497.93s (0:08:17) ==================== 2024-12-18T06:18:33.7426611Z 2024-12-18T06:18:33.7426987Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 3/3 (test/test-reports/distributed.fsdp.test_fsdp_core_3.3_87a7fda532bc3014_.log) 2024-12-18T06:18:33.7427424Z 2024-12-18T06:18:33.7427688Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... [2024-12-18 06:18:33.678246] 2024-12-18T06:18:33.7428113Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:18:33.7428981Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
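For context on the "Tensor-likes are not close!" rerun above: the harness's assertEqual rests on the same comparison machinery as torch.testing.assert_close, whose float32 defaults match the tolerances printed in the log (rtol=1.3e-6, atol=1e-5), so an absolute gap of ~0.107 between the FSDP and DDP outputs fails by a wide margin. A self-contained sketch of that check (the tensors are illustrative; the 12x8 shape mirrors the 96 elements reported):

    import torch

    ddp_out = torch.zeros(12, 8)
    fsdp_out = ddp_out.clone()
    fsdp_out[10, 0] += 0.1068  # ~ the greatest absolute difference in the log

    # Raises AssertionError: Tensor-likes are not close!
    torch.testing.assert_close(fsdp_out, ddp_out)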
[2024-12-18 06:18:33.678602] 2024-12-18T06:18:49.0253332Z 2024-12-18T06:18:49.0255572Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.test_sharded_tensor_reshard_1.1_0293eab441e08073_.log 2024-12-18T06:18:49.0258896Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard, test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard_errors 2024-12-18T06:18:49.0261756Z 2024-12-18T06:18:49.0262174Z Running distributed/test_launcher 1/1 ... [2024-12-18 06:18:49.025611] 2024-12-18T06:18:49.0262961Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:18:49.0268354Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_launcher.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:18:49.026193] 2024-12-18T06:18:55.8586650Z 2024-12-18T06:18:55.8588591Z distributed/test_launcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_launcher_1.1_051b9da39e2d3828_.log 2024-12-18T06:18:55.8590603Z Running 1 items in this shard: test/distributed/test_launcher.py::TestDistributedLaunch::test_launch_user_script 2024-12-18T06:18:55.8591520Z 2024-12-18T06:18:55.8592812Z Running distributed/_shard/sharded_tensor/test_sharded_tensor 1/1 ... [2024-12-18 06:18:55.858888] 2024-12-18T06:18:55.8593998Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:18:55.8598133Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 06:18:55.859253] 2024-12-18T06:24:16.6053261Z 2024-12-18T06:24:16.6055460Z distributed/_shard/sharded_tensor/test_sharded_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.test_sharded_tensor_1.1_e401c29f6e30b962_.log 2024-12-18T06:24:16.6117872Z Running 69 items in this shard: test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorMetadata::test_serialize_and_deserialize, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestCreateTensorFromParams::test_empty, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardParameter::test_shard_parameter, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardParameter::test_shard_parameter_errors, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardTensor::test_shard_tensor, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardTensor::test_shard_tensor_errors, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardTensor::test_shard_tensor_with_empty_shard, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestModuleHookApi::test_collect_local_shard, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestModuleHookApi::test_reshard_output, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestLocalTensor::test_local_tensor, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestLocalTensor::test_local_tensor_error, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_cleanup, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_complete_world_size, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_create_sharded_tensor_like, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_create_sharded_tensor_with_full, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_create_sharded_tensor_with_ones, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_create_sharded_tensor_with_rand, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_create_sharded_tensor_with_zeros, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_gather_even, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_gather_uneven, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_insufficient_sharding_dims, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_invalid_pg_rpc_ranks, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_invalid_sharding, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_load_state_dict_errors, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_multiple_local_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_new_group, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_partial_world_size, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_sharded_tensor_metadata, 
test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_sharded_tensor_sizes, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_sharding_columns, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_state_dict, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_state_dict_new_group, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorChunked::test_state_dict_no_sharded_tensors, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_create_sharded_tensor_with_ones, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_gather_even, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_gather_uneven, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_grid_sharding, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_multiple_local_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_new_group, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_partial_world_size, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_sharded_tensor_device, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_sharded_tensor_metadata, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_sharded_tensor_to_cpu, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_sharded_tensor_to_cuda, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_sharded_tensor_to_test, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_uneven_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorEnumerable::test_with_rpc_names, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalTensor::test_init_from_local_tensor, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalTensor::test_init_from_local_tensor_errors, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_and_global_metadata, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_and_global_metadata_invalid_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_invalid_local_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_invalid_pin_memory, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_invalid_property_cross_ranks, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_invalid_shards_gaps, 
test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_invalid_shards_overlap, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_init_from_local_shards_new_group, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_local_shards, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorFromLocalShards::test_st_base_init_from_local_shards_and_global_metadata, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorCustomOps::test_custom_op, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorCustomOps::test_custom_op_errors, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorCustomOps::test_custom_op_override, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardMetadata::test_create_shard_with_no_placement, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardMetadata::test_shard_metadata_init, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorSubGroupInit::test_sub_process_group_placement_validation, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestShardedTensorSubGroupInit::test_sub_process_group_sharded_tensor_init, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestCreateTensorNoProcessGroupMode::test_init_from_local_shards_and_global_metadata, test/distributed/_shard/sharded_tensor/test_sharded_tensor.py::TestCreateTensorNoProcessGroupMode::test_non_contiguous_local_shards 2024-12-18T06:24:16.6163591Z 2024-12-18T06:24:16.6163865Z Running distributed/fsdp/test_fsdp_mixed_precision 2/2 ... [2024-12-18 06:24:16.605729] 2024-12-18T06:24:16.6164338Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:24:16.6165364Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_mixed_precision.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 06:24:16.606308] 2024-12-18T06:28:53.9101387Z 2024-12-18T06:28:53.9103356Z distributed/fsdp/test_fsdp_mixed_precision 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_mixed_precision_2.2_ecef839327f1fc34_.log 2024-12-18T06:28:53.9118596Z Running 26 items in this shard: test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_eval_root_cast_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_comm, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_resnet, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_False, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_default, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_only_params_and_bufs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_params_and_reduce_diff, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_reduce, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions_error 2024-12-18T06:28:53.9150049Z 2024-12-18T06:28:53.9150280Z Running distributed/test_c10d_spawn_gloo 1/1 ... [2024-12-18 06:28:53.910430] 2024-12-18T06:28:53.9150720Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:28:53.9151926Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_spawn_gloo.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:28:53.910788] 2024-12-18T06:30:59.9489891Z 2024-12-18T06:30:59.9491593Z distributed/test_c10d_spawn_gloo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_gloo_1.1_2f93811f8b276fb0_.log 2024-12-18T06:30:59.9502859Z Running 11 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cpu, test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cuda, test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_rnn, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_gather, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all_single, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_allreduce, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_broadcast, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_gather, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_reduce, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_scatter 2024-12-18T06:30:59.9512759Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cpu 2024-12-18T06:30:59.9514542Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cuda 2024-12-18T06:30:59.9515444Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_rnn 2024-12-18T06:30:59.9516288Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_gather 2024-12-18T06:30:59.9517125Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all 2024-12-18T06:30:59.9517965Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all_single 2024-12-18T06:30:59.9518799Z Running 1 items in this shard: 
test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_allreduce 2024-12-18T06:30:59.9519597Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_broadcast 2024-12-18T06:30:59.9520536Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_gather 2024-12-18T06:30:59.9522329Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_reduce 2024-12-18T06:30:59.9523945Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_scatter 2024-12-18T06:30:59.9524799Z 2024-12-18T06:30:59.9525216Z Running distributed/test_c10d_spawn_ucc 1/1 ... [2024-12-18 06:30:59.949041] 2024-12-18T06:30:59.9526035Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:30:59.9527988Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_spawn_ucc.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:30:59.949361] 2024-12-18T06:31:36.1192073Z 2024-12-18T06:31:36.1193541Z distributed/test_c10d_spawn_ucc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_ucc_1.1_0f3d2946298ad24c_.log 2024-12-18T06:31:36.1199497Z Running 6 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2024-12-18T06:31:36.1204718Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather 2024-12-18T06:31:36.1206566Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all 2024-12-18T06:31:36.1208180Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single 2024-12-18T06:31:36.1209035Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce 2024-12-18T06:31:36.1210412Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast 2024-12-18T06:31:36.1211228Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2024-12-18T06:31:36.1211662Z 2024-12-18T06:31:36.1212105Z Running distributed/test_c10d_spawn_nccl 1/1 ... [2024-12-18 06:31:36.119675] 2024-12-18T06:31:36.1212544Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:31:36.1213573Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_spawn_nccl.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 06:31:36.120281] 2024-12-18T06:33:25.2964218Z 2024-12-18T06:33:25.2966012Z distributed/test_c10d_spawn_nccl 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_nccl_1.1_237df9282dc809bd_.log 2024-12-18T06:33:25.2973531Z Running 9 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_gather, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_gather_base, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_to_all, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_to_all_single, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_allreduce, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_broadcast, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_reduce, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_reduce_scatter, test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_reduce_scatter_non_contiguous 2024-12-18T06:33:25.2977286Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_gather 2024-12-18T06:33:25.2978129Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_gather_base 2024-12-18T06:33:25.2978958Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_to_all 2024-12-18T06:33:25.2979812Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_all_to_all_single 2024-12-18T06:33:25.2980648Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_allreduce 2024-12-18T06:33:25.2981447Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_broadcast 2024-12-18T06:33:25.2982251Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_reduce 2024-12-18T06:33:25.2983070Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_reduce_scatter 2024-12-18T06:33:25.2983983Z Running 1 items in this shard: test/distributed/test_c10d_spawn_nccl.py::TestDistributedNNFunctionsNccl::test_reduce_scatter_non_contiguous 2024-12-18T06:33:25.2984514Z 2024-12-18T06:33:25.2984775Z Running distributed/elastic/events/lib_test 1/1 ... [2024-12-18 06:33:25.296958] 2024-12-18T06:33:25.2985220Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2024-12-18T06:33:25.2986206Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/events/lib_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2024-12-18 06:33:25.297573]
2024-12-18T06:33:30.5239550Z
2024-12-18T06:33:30.5241289Z distributed/elastic/events/lib_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.events.lib_test_1.1_e25462eed2d38423_.log
2024-12-18T06:33:30.5246924Z Running 8 items in this shard: test/distributed/elastic/events/lib_test.py::EventLibTest::test_event_created, test/distributed/elastic/events/lib_test.py::EventLibTest::test_event_deser, test/distributed/elastic/events/lib_test.py::EventLibTest::test_get_or_create_logger, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event_does_not_run_if_invalid_dest, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_created, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_deserialize, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_str
2024-12-18T06:33:30.5249705Z
2024-12-18T06:33:30.5249966Z Running distributed/elastic/metrics/api_test 1/1 ... [2024-12-18 06:33:30.523933]
2024-12-18T06:33:30.5250424Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T06:33:30.5251430Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/metrics/api_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:33:30.524282]
2024-12-18T06:33:35.6505409Z
2024-12-18T06:33:35.6507313Z distributed/elastic/metrics/api_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.metrics.api_test_1.1_5560c934f5defb02_.log
2024-12-18T06:33:35.6510619Z Running 3 items in this shard: test/distributed/elastic/metrics/api_test.py::MetricsApiTest::test_get_metric_name, test/distributed/elastic/metrics/api_test.py::MetricsApiTest::test_inheritance, test/distributed/elastic/metrics/api_test.py::MetricsApiTest::test_profile
2024-12-18T06:33:35.6512596Z
2024-12-18T06:33:35.6513387Z Running distributed/elastic/multiprocessing/api_test 1/1 ... [2024-12-18 06:33:35.650906]
2024-12-18T06:33:35.6514387Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T06:33:35.6521509Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/multiprocessing/api_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:33:35.651519]
2024-12-18T06:34:12.6980753Z
2024-12-18T06:34:12.6982823Z distributed/elastic/multiprocessing/api_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.multiprocessing.api_test_1.1_7ace8168d7eeb750_.log
2024-12-18T06:34:12.7005821Z Running 25 items in this shard: test/distributed/elastic/multiprocessing/api_test.py::RunProcResultsTest::test_get_failures, test/distributed/elastic/multiprocessing/api_test.py::RunProcResultsTest::test_is_failed, test/distributed/elastic/multiprocessing/api_test.py::StdTest::test_from_str_bad_input, test/distributed/elastic/multiprocessing/api_test.py::StdTest::test_from_value, test/distributed/elastic/multiprocessing/api_test.py::StdTest::test_from_value_map, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_args_env_len_mismatch, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_function_large_ret_val, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_function_raise, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_function_with_tensor, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_invalid_log_dir, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_multiprocess_context_close, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_multiprocessing_context_poll_raises_exception, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_pcontext_wait, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_pcontext_wait_on_a_child_thread, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_to_map, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_void_function, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_wait_for_all_child_procs_to_exit, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_binary_exit, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_binary_incorrect_entrypoint, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_binary_raises, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_subprocess_context_close, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_validate_full_rank, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsFuncTest::test_function, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsBinaryTest::test_binary, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsBinaryTest::test_binary_redirect_and_tee
2024-12-18T06:34:12.7027033Z
2024-12-18T06:34:12.7027599Z Running distributed/elastic/timer/local_timer_example 1/1 ... [2024-12-18 06:34:12.698498]
2024-12-18T06:34:12.7028548Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T06:34:12.7030545Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/timer/local_timer_example.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:34:12.699197]
2024-12-18T06:34:34.6605732Z
2024-12-18T06:34:34.6607499Z distributed/elastic/timer/local_timer_example 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.timer.local_timer_example_1.1_fb5832c6c9217f6d_.log
2024-12-18T06:34:34.6610696Z Running 2 items in this shard: test/distributed/elastic/timer/local_timer_example.py::LocalTimerExample::test_example_start_method_spawn, test/distributed/elastic/timer/local_timer_example.py::LocalTimerExample::test_torch_mp_example
2024-12-18T06:34:34.6612498Z
2024-12-18T06:34:34.6614194Z Running distributed/elastic/utils/logging_test 1/1 ... [2024-12-18 06:34:34.660957]
2024-12-18T06:34:34.6615554Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T06:34:34.6621672Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/utils/logging_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:34:34.661636]
2024-12-18T06:34:39.9887437Z
2024-12-18T06:34:39.9894708Z distributed/elastic/utils/logging_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.utils.logging_test_1.1_a6520880d4cb0a09_.log
2024-12-18T06:34:39.9897558Z Running 2 items in this shard: test/distributed/elastic/utils/logging_test.py::LoggingTest::test_derive_module_name, test/distributed/elastic/utils/logging_test.py::LoggingTest::test_logger_name
2024-12-18T06:34:39.9899037Z
2024-12-18T06:34:39.9899558Z Running distributed/elastic/utils/util_test 1/1 ... [2024-12-18 06:34:39.988963]
2024-12-18T06:34:39.9900455Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2024-12-18T06:34:39.9902390Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/utils/util_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-12-18 06:34:39.989265]
2024-12-18T06:34:45.2161863Z
2024-12-18T06:34:45.2163853Z distributed/elastic/utils/util_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.utils.util_test_1.1_c7df02b1459e573a_.log
2024-12-18T06:34:45.2174964Z Running 12 items in this shard: test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier_hash_store, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier_timeout_operations, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier_timeout_rank_tracing, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_get_all_rank_0, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_get_all_rank_n, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_synchronize, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_synchronize_hash_store, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger_custom_name, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger_different, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger_none
2024-12-18T06:34:45.2182151Z
2024-12-18T06:34:46.2340967Z Running test batch 'tests to run' cost 10419.2 seconds
2024-12-18T06:34:47.2940513Z
2024-12-18T06:34:47.2941209Z real 173m50.820s
2024-12-18T06:34:47.2941728Z user 294m38.182s
2024-12-18T06:34:47.2942164Z sys 232m59.714s
2024-12-18T06:34:47.2944687Z + assert_git_not_dirty
2024-12-18T06:34:47.2945301Z + [[ linux-focal-rocm6.2-py3.10 != *rocm* ]]
2024-12-18T06:34:47.2945996Z + [[ linux-focal-rocm6.2-py3.10 == *cuda* ]]
2024-12-18T06:34:47.2946675Z + [[ linux-focal-rocm6.2-py3.10 == *rocm* ]]
2024-12-18T06:34:47.2947256Z + [[ 2 == 1 ]]
2024-12-18T06:34:47.2947648Z + [[ 2 == 1 ]]
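Each test file in the batch above ran in its own interpreter; the Executing lines show the exact invocation the harness builds. A representative command, reconstructed from the log (python's -bb promotes implicit bytes/str mixing to errors, -p no:xdist disables the pytest-xdist plugin, --reruns=2 comes from pytest-rerunfailures, and --use-pytest/--import-slow-tests/--import-disabled-tests appear to be PyTorch test-harness flags):

    /opt/conda/envs/py_3.10/bin/python -bb distributed/elastic/utils/util_test.py \
        --shard-id=1 --num-shards=1 -v -vv -rfEX -p no:xdist \
        --use-pytest -x --reruns=2 --import-slow-tests --import-disabled-tests

--shard-id/--num-shards are 1/1 here even though this job is shard 2 of 3, because the job-level sharding appears to distribute whole test files across jobs rather than splitting within a file.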
2024-12-18T06:34:47.3060004Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
2024-12-18T06:34:47.3060808Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
2024-12-18T06:34:47.3061739Z docker exec -t "a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test"
2024-12-18T06:34:47.3100642Z shell: /usr/bin/bash -e {0}
2024-12-18T06:34:47.3101057Z env:
2024-12-18T06:34:47.3101375Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:47.3101822Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:47.3102625Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:47.3103378Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:47.3103798Z   AWS_REGION: us-east-1
2024-12-18T06:34:47.3104260Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:47.3104859Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:47.3112686Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:47.3113089Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:47.3113530Z ##[endgroup]
2024-12-18T06:34:47.3633510Z sudo: setrlimit(RLIMIT_STACK): Operation not permitted
2024-12-18T06:34:47.4128584Z ##[group]Run cat test/**/*_toprint.log || true
2024-12-18T06:34:47.4129384Z cat test/**/*_toprint.log || true
2024-12-18T06:34:47.4189283Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T06:34:47.4190010Z env:
2024-12-18T06:34:47.4190447Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:47.4191056Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:47.4192083Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:47.4193059Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:47.4193601Z   AWS_REGION: us-east-1
2024-12-18T06:34:47.4194269Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:47.4195107Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:47.4205677Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:47.4206479Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:47.4207351Z ##[endgroup]
2024-12-18T06:34:47.4373403Z cat: 'test/**/*_toprint.log': No such file or directory
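The cat failure above just means no *_toprint.log files existed: in bash, test/**/*_toprint.log only recurses when the globstar option is enabled, and an unmatched glob is passed through as the literal pattern. A minimal sketch of a quieter, recursive variant (assuming bash 4+; not the workflow's actual step):

    shopt -s globstar nullglob
    for f in test/**/*_toprint.log; do
        cat "$f"
    done

With nullglob set, an empty match expands to nothing, so the loop body never runs instead of cat erroring on the literal pattern.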
2024-12-18T06:34:47.4608991Z Prepare all required actions
2024-12-18T06:34:47.4609824Z Getting action download info
2024-12-18T06:34:47.9049979Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a)
2024-12-18T06:34:48.5555243Z ##[group]Run ./.github/actions/upload-test-artifacts
2024-12-18T06:34:48.5555527Z with:
2024-12-18T06:34:48.5555702Z   use-gha: true
2024-12-18T06:34:48.5555969Z   file-suffix: test-distributed-2-3-linux.rocm.gpu_34566687110
2024-12-18T06:34:48.5556273Z   s3-bucket: gha-artifacts
2024-12-18T06:34:48.5556476Z env:
2024-12-18T06:34:48.5556666Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:48.5556947Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:48.5557441Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:48.5557902Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:48.5558167Z   AWS_REGION: us-east-1
2024-12-18T06:34:48.5558530Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:48.5558851Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:48.5563039Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:48.5563372Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:48.5563727Z ##[endgroup]
2024-12-18T06:34:48.5635476Z ##[group]Run actions/upload-artifact@v4
2024-12-18T06:34:48.5635817Z with:
2024-12-18T06:34:48.5636290Z   name: test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip
2024-12-18T06:34:48.5636748Z   retention-days: 14
2024-12-18T06:34:48.5637100Z   if-no-files-found: warn
2024-12-18T06:34:48.5637417Z   path: test/**/*.json
2024-12-18T06:34:48.5637707Z   compression-level: 6
2024-12-18T06:34:48.5638015Z   overwrite: false
2024-12-18T06:34:48.5638299Z   include-hidden-files: false
2024-12-18T06:34:48.5638626Z env:
2024-12-18T06:34:48.5638883Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:48.5639194Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:48.5639762Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:48.5640242Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:48.5640566Z   AWS_REGION: us-east-1
2024-12-18T06:34:48.5640916Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:48.5641296Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:48.5645622Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:48.5646044Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:48.5646484Z ##[endgroup]
2024-12-18T06:34:49.1508786Z With the provided path, there will be 6 files uploaded
2024-12-18T06:34:49.1514506Z Artifact name is valid!
2024-12-18T06:34:49.1515322Z Root directory input is valid!
2024-12-18T06:34:50.5626895Z Beginning upload of artifact content to blob storage
2024-12-18T06:34:51.0804261Z Uploaded bytes 44674
2024-12-18T06:34:51.1699688Z Finished uploading artifact content to blob storage!
2024-12-18T06:34:51.1705417Z SHA256 hash of uploaded artifact zip is 72cb842c5feaa3fbcc0f8518e5a912c9f037f99dca880156d78e8ed91019d9df
2024-12-18T06:34:51.1707784Z Finalizing artifact upload
2024-12-18T06:34:51.3187887Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip.zip successfully finalized. Artifact ID 2336005662
2024-12-18T06:34:51.3190165Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip has been successfully uploaded! Final size is 44674 bytes. Artifact ID is 2336005662
2024-12-18T06:34:51.3201378Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/12383255654/artifacts/2336005662
2024-12-18T06:34:51.3522571Z ##[group]Run actions/upload-artifact@v4
2024-12-18T06:34:51.3523164Z with:
2024-12-18T06:34:51.3523955Z   name: test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip
2024-12-18T06:34:51.3524887Z   retention-days: 14
2024-12-18T06:34:51.3525367Z   if-no-files-found: ignore
2024-12-18T06:34:51.3525903Z   path: test/**/*.xml test/**/*.csv
2024-12-18T06:34:51.3527328Z   compression-level: 6
2024-12-18T06:34:51.3527824Z   overwrite: false
2024-12-18T06:34:51.3528294Z   include-hidden-files: false
2024-12-18T06:34:51.3528798Z env:
2024-12-18T06:34:51.3529194Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:51.3529750Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:51.3530750Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:51.3531700Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:51.3532237Z   AWS_REGION: us-east-1
2024-12-18T06:34:51.3532832Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:51.3533545Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:51.3543943Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:51.3544739Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:51.3545580Z ##[endgroup]
2024-12-18T06:34:52.0093138Z With the provided path, there will be 555 files uploaded
2024-12-18T06:34:52.0098044Z Artifact name is valid!
2024-12-18T06:34:52.0099114Z Root directory input is valid!
2024-12-18T06:34:53.6423458Z Beginning upload of artifact content to blob storage
2024-12-18T06:34:54.5617251Z Uploaded bytes 387922
2024-12-18T06:34:54.6500648Z Finished uploading artifact content to blob storage!
2024-12-18T06:34:54.6506445Z SHA256 hash of uploaded artifact zip is 3176f9a6bc6877ac73c8f67b5064cde035f391ee0d7d0be3374aabd4dc6f20ad
2024-12-18T06:34:54.6509151Z Finalizing artifact upload
2024-12-18T06:34:54.8038376Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip.zip successfully finalized. Artifact ID 2336005827
2024-12-18T06:34:54.8040732Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip has been successfully uploaded! Final size is 387922 bytes. Artifact ID is 2336005827
2024-12-18T06:34:54.8051728Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/12383255654/artifacts/2336005827
2024-12-18T06:34:54.8404208Z ##[group]Run actions/upload-artifact@v4
2024-12-18T06:34:54.8404810Z with:
2024-12-18T06:34:54.8405505Z   name: logs-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip
2024-12-18T06:34:54.8406345Z   retention-days: 14
2024-12-18T06:34:54.8406814Z   if-no-files-found: ignore
2024-12-18T06:34:54.8407342Z   path: usage_log.txt test/**/*.log
2024-12-18T06:34:54.8407897Z   compression-level: 6
2024-12-18T06:34:54.8408352Z   overwrite: false
2024-12-18T06:34:54.8408802Z   include-hidden-files: false
2024-12-18T06:34:54.8409298Z env:
2024-12-18T06:34:54.8409687Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:54.8410246Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:54.8411258Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:54.8412198Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:54.8412711Z   AWS_REGION: us-east-1
2024-12-18T06:34:54.8413286Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:54.8413973Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:54.8424551Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:54.8425340Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:54.8426170Z ##[endgroup]
2024-12-18T06:34:55.4683151Z Multiple search paths detected. Calculating the least common ancestor of all paths
2024-12-18T06:34:55.4685289Z The least common ancestor is /home/pytorchci/actions-runner/_work/pytorch/pytorch. This will be the root directory of the artifact
2024-12-18T06:34:55.4686590Z With the provided path, there will be 91 files uploaded
2024-12-18T06:34:55.4690134Z Artifact name is valid!
2024-12-18T06:34:55.4690752Z Root directory input is valid!
2024-12-18T06:34:57.0223415Z Beginning upload of artifact content to blob storage
2024-12-18T06:34:57.8493688Z Uploaded bytes 352860
2024-12-18T06:34:57.9384041Z Finished uploading artifact content to blob storage!
2024-12-18T06:34:57.9389870Z SHA256 hash of uploaded artifact zip is 593e62f85efc53b0d8d9cb33cb264fcdcf4262c1719b6a95a8917ce18db661fd
2024-12-18T06:34:57.9392886Z Finalizing artifact upload
2024-12-18T06:34:58.0816080Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip.zip successfully finalized. Artifact ID 2336005976
2024-12-18T06:34:58.0818267Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu_34566687110.zip has been successfully uploaded! Final size is 352860 bytes. Artifact ID is 2336005976
2024-12-18T06:34:58.0829662Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/12383255654/artifacts/2336005976
2024-12-18T06:34:58.1138547Z ##[group]Run # shellcheck disable=SC2156
2024-12-18T06:34:58.1139292Z # shellcheck disable=SC2156
2024-12-18T06:34:58.1140312Z find . -iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \;
2024-12-18T06:34:58.1197890Z shell: /usr/bin/bash -e {0}
2024-12-18T06:34:58.1198414Z env:
2024-12-18T06:34:58.1198820Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:58.1199390Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:58.1200425Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:58.1201364Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:58.1201881Z   AWS_REGION: us-east-1
2024-12-18T06:34:58.1202460Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:58.1203154Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:58.1213437Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:58.1214265Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:58.1215230Z ##[endgroup]
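The step above searches the workspace for core dumps and prints a backtrace for each via gdb inside the container; the trailing -ex 'q' is what keeps gdb from waiting at its interactive prompt. gdb's batch mode achieves the same thing, as in this sketch (same ${CONTAINER_NAME}; illustrative only, not the workflow's code):

    find . -iname 'core.[1-9]*' -exec docker exec "${CONTAINER_NAME}" \
        gdb -batch -ex 'bt' python {} \;

-batch makes gdb exit automatically after running the scripted commands, returning a nonzero status if one of them fails.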
2024-12-18T06:34:58.4069080Z Prepare all required actions
2024-12-18T06:34:58.4069823Z Getting action download info
2024-12-18T06:34:58.4121014Z ##[group]Run ./.github/actions/teardown-rocm
2024-12-18T06:34:58.4121590Z env:
2024-12-18T06:34:58.4121986Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:58.4122535Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:58.4123515Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:58.4124466Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:58.4124979Z   AWS_REGION: us-east-1
2024-12-18T06:34:58.4125530Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:58.4126221Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:58.4136560Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:58.4137332Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:58.4138164Z ##[endgroup]
2024-12-18T06:34:58.4160526Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty
2024-12-18T06:34:58.4161037Z # ignore expansion of "docker ps -q" since it could be empty
2024-12-18T06:34:58.4161429Z # shellcheck disable=SC2046
2024-12-18T06:34:58.4161758Z docker stop $(docker ps -q) || true
2024-12-18T06:34:58.4162089Z # Prune all stopped containers.
2024-12-18T06:34:58.4162406Z docker container prune -f
2024-12-18T06:34:58.4207098Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T06:34:58.4207793Z env:
2024-12-18T06:34:58.4208203Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:34:58.4208765Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:34:58.4209762Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:34:58.4210709Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:34:58.4211232Z   AWS_REGION: us-east-1
2024-12-18T06:34:58.4211796Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:34:58.4212483Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:34:58.4223463Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:34:58.4224375Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:34:58.4225373Z ##[endgroup]
2024-12-18T06:34:59.0876250Z a195586eb8a1
2024-12-18T06:35:18.4920160Z Deleted Containers:
2024-12-18T06:35:18.4921011Z a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:35:18.4921623Z
2024-12-18T06:35:18.4921836Z Total reclaimed space: 7.18GB
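The teardown stops whatever is still running and prunes stopped containers, reclaiming 7.18GB here. The '|| true' and the SC2046 suppression both exist because 'docker ps -q' can expand to nothing; a null-safe equivalent (a sketch, not the action's code) sidesteps both:

    docker ps -q | xargs -r docker stop
    docker container prune -f

xargs -r (GNU --no-run-if-empty) skips running docker stop entirely when no container IDs arrive on stdin.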
2024-12-18T06:35:18.4996967Z Prepare all required actions
2024-12-18T06:35:18.5046293Z ##[group]Run ./.github/actions/diskspace-cleanup
2024-12-18T06:35:18.5046933Z with:
2024-12-18T06:35:18.5047353Z   diskspace-cutoff: 70
2024-12-18T06:35:18.5047811Z env:
2024-12-18T06:35:18.5048214Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:35:18.5048812Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:35:18.5049825Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:35:18.5050814Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:35:18.5051337Z   AWS_REGION: us-east-1
2024-12-18T06:35:18.5051907Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:35:18.5052625Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:35:18.5064034Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:35:18.5064820Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:35:18.5065665Z ##[endgroup]
2024-12-18T06:35:18.5090154Z ##[group]Run set -ex
2024-12-18T06:35:18.5090739Z set -ex
2024-12-18T06:35:18.5091199Z diskspace_cutoff=70
2024-12-18T06:35:18.5091884Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}')
2024-12-18T06:35:18.5092974Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //')
2024-12-18T06:35:18.5094726Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified"
2024-12-18T06:35:18.5096532Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then
2024-12-18T06:35:18.5097268Z   docker system prune -af
2024-12-18T06:35:18.5098188Z   diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //')
2024-12-18T06:35:18.5099209Z   if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then
2024-12-18T06:35:18.5100288Z     echo "Error: Available diskspace is less than $diskspace_cutoff percent. Not enough diskspace."
2024-12-18T06:35:18.5101242Z     echo "$msg"
2024-12-18T06:35:18.5101730Z     exit 1
2024-12-18T06:35:18.5102178Z   else
2024-12-18T06:35:18.5102699Z     difference=$((diskspace - diskspace_new))
2024-12-18T06:35:18.5103429Z     echo "Diskspace saved: $difference percent"
2024-12-18T06:35:18.5104038Z   fi
2024-12-18T06:35:18.5104445Z fi
2024-12-18T06:35:18.5159548Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2024-12-18T06:35:18.5160298Z env:
2024-12-18T06:35:18.5160706Z   GIT_DEFAULT_BRANCH: main
2024-12-18T06:35:18.5161289Z   DOCKER_HOST: unix:///run/user/1001/docker.sock
2024-12-18T06:35:18.5162300Z   GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon
2024-12-18T06:35:18.5163226Z   AWS_DEFAULT_REGION: us-east-1
2024-12-18T06:35:18.5163747Z   AWS_REGION: us-east-1
2024-12-18T06:35:18.5164295Z   AWS_ACCESS_KEY_ID: ***
2024-12-18T06:35:18.5164997Z   AWS_SECRET_ACCESS_KEY: ***
2024-12-18T06:35:18.5175428Z   AWS_SESSION_TOKEN: ***
2024-12-18T06:35:18.5176202Z   CONTAINER_NAME: a195586eb8a191d75ed1195cd70100d037bba4a1d97216b76c17cc218bc57f83
2024-12-18T06:35:18.5177037Z ##[endgroup]
2024-12-18T06:35:18.5265230Z + diskspace_cutoff=70
2024-12-18T06:35:18.5273727Z ++ docker info -f '{{.DockerRootDir}}'
2024-12-18T06:35:18.5801851Z + docker_root_dir=/home/pytorchci/.local/share/docker
2024-12-18T06:35:18.5807920Z ++ df -H --output=pcent /home/pytorchci/.local/share/docker
2024-12-18T06:35:18.5808385Z ++ sed -n 2p
2024-12-18T06:35:18.5810992Z ++ sed s/%//
2024-12-18T06:35:18.5811397Z ++ sed 's/ //'
2024-12-18T06:35:18.5827121Z + diskspace=55
2024-12-18T06:35:18.5827661Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified'
2024-12-18T06:35:18.5828174Z + [[ 55 -ge 70 ]]
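The check above reads how full the Docker root directory's filesystem is and only prunes when usage is at or above the 70 percent cutoff; at 55 percent the branch is skipped, which is why the trace ends at the comparison. The sed pipeline strips the df header row, the percent sign, and padding; an equivalent one-liner (illustrative only) is:

    df -H --output=pcent "$(docker info -f '{{.DockerRootDir}}')" | sed -n 2p | tr -dc '0-9'

tr -dc '0-9' deletes every non-digit, collapsing the two trailing sed substitutions into a single step.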
2024-12-18T06:35:18.5875076Z Post job cleanup.
2024-12-18T06:35:18.5917105Z Post job cleanup.
2024-12-18T06:35:18.6261276Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com
2024-12-18T06:35:18.6547000Z Post job cleanup.
2024-12-18T06:35:18.7757353Z Post job cleanup.
2024-12-18T06:35:18.7844024Z Post job cleanup.
2024-12-18T06:35:18.7917314Z Post job cleanup.
2024-12-18T06:35:18.8601740Z [command]/usr/bin/git version
2024-12-18T06:35:18.8635720Z git version 2.34.1
2024-12-18T06:35:18.8672625Z Temporarily overriding HOME='/home/pytorchci/actions-runner/_work/_temp/e62dcd2d-e3d3-4c4c-822a-74885fc63ad5' before making global git config changes
2024-12-18T06:35:18.8674426Z Adding repository directory to the temporary git global config as a safe directory
2024-12-18T06:35:18.8675830Z [command]/usr/bin/git config --global --add safe.directory /home/pytorchci/actions-runner/_work/pytorch/pytorch
2024-12-18T06:35:18.8701301Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2024-12-18T06:35:18.8743653Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
2024-12-18T06:35:18.9072685Z Entering 'android/libs/fbjni'
2024-12-18T06:35:18.9132432Z Entering 'third_party/FP16'
2024-12-18T06:35:18.9176533Z Entering 'third_party/FXdiv'
2024-12-18T06:35:18.9224343Z Entering 'third_party/NNPACK'
2024-12-18T06:35:18.9269610Z Entering 'third_party/NVTX'
2024-12-18T06:35:18.9317531Z Entering 'third_party/VulkanMemoryAllocator'
2024-12-18T06:35:18.9358823Z Entering 'third_party/XNNPACK'
2024-12-18T06:35:18.9411899Z Entering 'third_party/benchmark'
2024-12-18T06:35:18.9463481Z Entering 'third_party/composable_kernel'
2024-12-18T06:35:18.9510607Z Entering 'third_party/cpp-httplib'
2024-12-18T06:35:18.9556554Z Entering 'third_party/cpuinfo'
2024-12-18T06:35:18.9599386Z Entering 'third_party/cudnn_frontend'
2024-12-18T06:35:18.9641566Z Entering 'third_party/cutlass'
2024-12-18T06:35:18.9701735Z Entering 'third_party/eigen'
2024-12-18T06:35:18.9760720Z Entering 'third_party/fbgemm'
2024-12-18T06:35:18.9802323Z Entering 'third_party/fbgemm/third_party/asmjit'
2024-12-18T06:35:18.9850602Z Entering 'third_party/fbgemm/third_party/cpuinfo'
2024-12-18T06:35:18.9892801Z Entering 'third_party/fbgemm/third_party/cutlass'
2024-12-18T06:35:18.9940224Z Entering 'third_party/fbgemm/third_party/googletest'
2024-12-18T06:35:18.9980577Z Entering 'third_party/fbgemm/third_party/hipify_torch'
2024-12-18T06:35:19.0025555Z Entering 'third_party/flatbuffers'
2024-12-18T06:35:19.0077560Z Entering 'third_party/fmt'
2024-12-18T06:35:19.0117889Z Entering 'third_party/gemmlowp/gemmlowp'
2024-12-18T06:35:19.0169845Z Entering 'third_party/gloo'
2024-12-18T06:35:19.0216612Z Entering 'third_party/googletest'
2024-12-18T06:35:19.0265540Z Entering 'third_party/ideep'
2024-12-18T06:35:19.0305759Z Entering 'third_party/ideep/mkl-dnn'
2024-12-18T06:35:19.0359242Z Entering 'third_party/ittapi'
2024-12-18T06:35:19.0412834Z Entering 'third_party/kineto'
2024-12-18T06:35:19.0457073Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2024-12-18T06:35:19.0502354Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2024-12-18T06:35:19.0546751Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2024-12-18T06:35:19.0591361Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2024-12-18T06:35:19.0639080Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2024-12-18T06:35:19.0681611Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2024-12-18T06:35:19.0725944Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2024-12-18T06:35:19.0772863Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2024-12-18T06:35:19.0816578Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2024-12-18T06:35:19.0855790Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2024-12-18T06:35:19.0904557Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2024-12-18T06:35:19.0946368Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2024-12-18T06:35:19.0988429Z Entering 'third_party/mimalloc'
2024-12-18T06:35:19.1027129Z Entering 'third_party/nccl/nccl'
2024-12-18T06:35:19.1066656Z Entering 'third_party/nlohmann'
2024-12-18T06:35:19.1121062Z Entering 'third_party/onnx'
2024-12-18T06:35:19.1184433Z Entering 'third_party/onnx/third_party/pybind11'
2024-12-18T06:35:19.1236846Z Entering 'third_party/opentelemetry-cpp'
2024-12-18T06:35:19.1292813Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2024-12-18T06:35:19.1336434Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2024-12-18T06:35:19.1377618Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2024-12-18T06:35:19.1418271Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2024-12-18T06:35:19.1461281Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2024-12-18T06:35:19.1505002Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2024-12-18T06:35:19.1546304Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2024-12-18T06:35:19.1582754Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2024-12-18T06:35:19.1627654Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2024-12-18T06:35:19.1671310Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2024-12-18T06:35:19.1732463Z Entering 'third_party/pocketfft'
2024-12-18T06:35:19.1779337Z Entering 'third_party/protobuf'
2024-12-18T06:35:19.1826986Z Entering 'third_party/protobuf/third_party/benchmark'
2024-12-18T06:35:19.1870387Z Entering 'third_party/protobuf/third_party/googletest'
2024-12-18T06:35:19.1916154Z Entering 'third_party/psimd'
2024-12-18T06:35:19.1958472Z Entering 'third_party/pthreadpool'
2024-12-18T06:35:19.2000016Z Entering 'third_party/pybind11'
2024-12-18T06:35:19.2042455Z Entering 'third_party/python-peachpy'
2024-12-18T06:35:19.2093011Z Entering 'third_party/sleef'
2024-12-18T06:35:19.2135663Z Entering 'third_party/tensorpipe'
2024-12-18T06:35:19.2174233Z Entering 'third_party/tensorpipe/third_party/googletest'
2024-12-18T06:35:19.2219957Z Entering 'third_party/tensorpipe/third_party/libnop'
2024-12-18T06:35:19.2261730Z Entering 'third_party/tensorpipe/third_party/libuv'
2024-12-18T06:35:19.2302256Z Entering 'third_party/tensorpipe/third_party/pybind11'
2024-12-18T06:35:19.2340796Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2024-12-18T06:35:19.2406630Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2024-12-18T06:35:19.2425523Z http.https://github.com/.extraheader
2024-12-18T06:35:19.2433502Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader
2024-12-18T06:35:19.2471451Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
2024-12-18T06:35:19.2728163Z Entering 'android/libs/fbjni'
2024-12-18T06:35:19.2756718Z http.https://github.com/.extraheader
2024-12-18T06:35:19.2800749Z Entering 'third_party/FP16'
2024-12-18T06:35:19.2823275Z http.https://github.com/.extraheader
2024-12-18T06:35:19.2855281Z Entering 'third_party/FXdiv'
2024-12-18T06:35:19.2877546Z http.https://github.com/.extraheader
2024-12-18T06:35:19.2916639Z Entering 'third_party/NNPACK'
2024-12-18T06:35:19.2939820Z http.https://github.com/.extraheader
2024-12-18T06:35:19.2979802Z Entering 'third_party/NVTX'
2024-12-18T06:35:19.3003298Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3039116Z Entering 'third_party/VulkanMemoryAllocator'
2024-12-18T06:35:19.3064514Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3095647Z Entering 'third_party/XNNPACK'
2024-12-18T06:35:19.3117426Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3163016Z Entering 'third_party/benchmark'
2024-12-18T06:35:19.3186291Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3221395Z Entering 'third_party/composable_kernel'
2024-12-18T06:35:19.3246073Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3281120Z Entering 'third_party/cpp-httplib'
2024-12-18T06:35:19.3304867Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3334918Z Entering 'third_party/cpuinfo'
2024-12-18T06:35:19.3357063Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3387261Z Entering 'third_party/cudnn_frontend'
2024-12-18T06:35:19.3407368Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3441928Z Entering 'third_party/cutlass'
2024-12-18T06:35:19.3469016Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3505653Z Entering 'third_party/eigen'
2024-12-18T06:35:19.3530298Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3568274Z Entering 'third_party/fbgemm'
2024-12-18T06:35:19.3589395Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3623080Z Entering 'third_party/fbgemm/third_party/asmjit'
2024-12-18T06:35:19.3643839Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3691911Z Entering 'third_party/fbgemm/third_party/cpuinfo'
2024-12-18T06:35:19.3712779Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3750704Z Entering 'third_party/fbgemm/third_party/cutlass'
2024-12-18T06:35:19.3772359Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3812481Z Entering 'third_party/fbgemm/third_party/googletest'
2024-12-18T06:35:19.3834863Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3867001Z Entering 'third_party/fbgemm/third_party/hipify_torch'
2024-12-18T06:35:19.3887998Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3922074Z Entering 'third_party/flatbuffers'
2024-12-18T06:35:19.3942461Z http.https://github.com/.extraheader
2024-12-18T06:35:19.3977319Z Entering 'third_party/fmt'
2024-12-18T06:35:19.4000251Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4038973Z Entering 'third_party/gemmlowp/gemmlowp'
2024-12-18T06:35:19.4066596Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4103807Z Entering 'third_party/gloo'
2024-12-18T06:35:19.4125910Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4161086Z Entering 'third_party/googletest'
2024-12-18T06:35:19.4181851Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4213693Z Entering 'third_party/ideep'
2024-12-18T06:35:19.4240894Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4274267Z Entering 'third_party/ideep/mkl-dnn'
2024-12-18T06:35:19.4296662Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4339612Z Entering 'third_party/ittapi'
2024-12-18T06:35:19.4362023Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4396828Z Entering 'third_party/kineto'
2024-12-18T06:35:19.4418770Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4453821Z Entering 'third_party/kineto/libkineto/third_party/dynolog'
2024-12-18T06:35:19.4479135Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4514860Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'
2024-12-18T06:35:19.4534740Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4571775Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'
2024-12-18T06:35:19.4593736Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4626782Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'
2024-12-18T06:35:19.4647086Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4677443Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'
2024-12-18T06:35:19.4700576Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4728861Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'
2024-12-18T06:35:19.4751006Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4787851Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog'
2024-12-18T06:35:19.4809089Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4844061Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'
2024-12-18T06:35:19.4868777Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4904022Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json'
2024-12-18T06:35:19.4924941Z http.https://github.com/.extraheader
2024-12-18T06:35:19.4959462Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'
2024-12-18T06:35:19.4982779Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5025632Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2024-12-18T06:35:19.5046433Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5078925Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2024-12-18T06:35:19.5099453Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5130506Z Entering 'third_party/mimalloc'
2024-12-18T06:35:19.5157137Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5202076Z Entering 'third_party/nccl/nccl'
2024-12-18T06:35:19.5230596Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5260249Z Entering 'third_party/nlohmann'
2024-12-18T06:35:19.5284498Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5315004Z Entering 'third_party/onnx'
2024-12-18T06:35:19.5338565Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5387480Z Entering 'third_party/onnx/third_party/pybind11'
2024-12-18T06:35:19.5410428Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5447840Z Entering 'third_party/opentelemetry-cpp'
2024-12-18T06:35:19.5472023Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5505938Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2024-12-18T06:35:19.5531443Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5565836Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2024-12-18T06:35:19.5588787Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5621166Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2024-12-18T06:35:19.5641760Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5677771Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2024-12-18T06:35:19.5698211Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5728516Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2024-12-18T06:35:19.5748522Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5782945Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2024-12-18T06:35:19.5812138Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5854347Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2024-12-18T06:35:19.5877797Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5906247Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2024-12-18T06:35:19.5935230Z http.https://github.com/.extraheader
2024-12-18T06:35:19.5974788Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2024-12-18T06:35:19.5996732Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6034835Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2024-12-18T06:35:19.6059300Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6112867Z Entering 'third_party/pocketfft'
2024-12-18T06:35:19.6137614Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6170582Z Entering 'third_party/protobuf'
2024-12-18T06:35:19.6195405Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6236202Z Entering 'third_party/protobuf/third_party/benchmark'
2024-12-18T06:35:19.6262570Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6292414Z Entering 'third_party/protobuf/third_party/googletest'
2024-12-18T06:35:19.6313770Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6355958Z Entering 'third_party/psimd'
2024-12-18T06:35:19.6383027Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6426619Z Entering 'third_party/pthreadpool'
2024-12-18T06:35:19.6452277Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6492807Z Entering 'third_party/pybind11'
2024-12-18T06:35:19.6519334Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6552778Z Entering 'third_party/python-peachpy'
2024-12-18T06:35:19.6578895Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6606694Z Entering 'third_party/sleef'
2024-12-18T06:35:19.6631820Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6665310Z Entering 'third_party/tensorpipe'
2024-12-18T06:35:19.6687424Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6719925Z Entering 'third_party/tensorpipe/third_party/googletest'
2024-12-18T06:35:19.6740221Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6772117Z Entering 'third_party/tensorpipe/third_party/libnop'
2024-12-18T06:35:19.6793311Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6822056Z Entering 'third_party/tensorpipe/third_party/libuv'
2024-12-18T06:35:19.6844024Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6876452Z Entering 'third_party/tensorpipe/third_party/pybind11'
2024-12-18T06:35:19.6901383Z http.https://github.com/.extraheader
2024-12-18T06:35:19.6929142Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2024-12-18T06:35:19.6953950Z http.https://github.com/.extraheader
2024-12-18T06:35:19.7169738Z Cleaning up orphan processes
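The long Entering/extraheader walk at the end is actions/checkout's post-job credential scrub: the authorization header it injected into .git/config, and into every submodule's config, is removed before the runner picks up another job. The core of it, as the log shows, is a single recursive foreach:

    git submodule foreach --recursive \
        sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' \
            && git config --local --unset-all 'http.https://github.com/.extraheader' || :"

Afterwards, git config --local --get-regexp extraheader should print nothing in the superproject or in any submodule.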