2022-09-27T15:36:23.0232533Z Requested labels: linux.8xlarge.nvidia.gpu
2022-09-27T15:36:23.0232600Z Job defined at: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/pull/85462/merge
2022-09-27T15:36:23.0232623Z Waiting for a runner to pick up this job...
2022-09-27T15:36:41.3805614Z Job is about to start running on the runner: i-0f5565a17788248fc (organization)
2022-09-27T15:36:46.2242339Z Current runner version: '2.296.2'
2022-09-27T15:36:46.2249937Z Runner name: 'i-0f5565a17788248fc'
2022-09-27T15:36:46.2250681Z Runner group name: 'Default'
2022-09-27T15:36:46.2251506Z Machine name: 'ip-10-0-6-59'
2022-09-27T15:36:46.2254184Z ##[group]GITHUB_TOKEN Permissions
2022-09-27T15:36:46.2255054Z Actions: read
2022-09-27T15:36:46.2255553Z Checks: read
2022-09-27T15:36:46.2255988Z Contents: read
2022-09-27T15:36:46.2256369Z Deployments: read
2022-09-27T15:36:46.2256831Z Discussions: read
2022-09-27T15:36:46.2257292Z Issues: read
2022-09-27T15:36:46.2257659Z Metadata: read
2022-09-27T15:36:46.2258092Z Packages: read
2022-09-27T15:36:46.2258528Z Pages: read
2022-09-27T15:36:46.2258912Z PullRequests: read
2022-09-27T15:36:46.2259423Z RepositoryProjects: read
2022-09-27T15:36:46.2259908Z SecurityEvents: read
2022-09-27T15:36:46.2260296Z Statuses: read
2022-09-27T15:36:46.2260735Z ##[endgroup]
2022-09-27T15:36:46.2264878Z Secret source: None
2022-09-27T15:36:46.2265753Z Prepare workflow directory
2022-09-27T15:36:46.3555983Z Prepare all required actions
2022-09-27T15:36:46.3781033Z Getting action download info
2022-09-27T15:36:46.6384760Z Download action repository 'pytorch/pytorch@master' (SHA:15c52ffc4f9a02f7078033677d44ccd760107952)
2022-09-27T15:36:49.9232380Z Download action repository 'nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767' (SHA:7d4a37704547a311dbb66ebdf5b23ec19374a767)
2022-09-27T15:36:50.0469302Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:3c1d75049465d7dfa70acca6d80b9c5c06ff4886)
2022-09-27T15:36:50.3349694Z Getting action download info
2022-09-27T15:36:50.5013363Z Download action repository 'malfet/checkout@silent-checkout' (SHA:f63e9e15406be6060f159846cd2e098f759c5246)
2022-09-27T15:36:50.9292978Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml
2022-09-27T15:36:50.9295484Z ##[group] Inputs
2022-09-27T15:36:50.9295862Z build-environment: linux-bionic-cuda11.6-py3.10-gcc7
2022-09-27T15:36:50.9297032Z test-matrix: { include: [ { config: "default", shard: 1, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 2, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 3, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 4, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "distributed", shard: 1, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "distributed", shard: 2, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "distributed", shard: 3, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "functorch", shard: 1, num_shards: 1, runner: "linux.4xlarge.nvidia.gpu" }, ]}
2022-09-27T15:36:50.9298286Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29
2022-09-27T15:36:50.9298755Z sync-tag:
2022-09-27T15:36:50.9298986Z ##[endgroup]
2022-09-27T15:36:50.9299830Z Complete job name: linux-bionic-cuda11.6-py3.10-gcc7 / test (distributed, 2, 3, linux.8xlarge.nvidia.gpu)
2022-09-27T15:36:51.0387322Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@master
2022-09-27T15:36:51.0387718Z with:
2022-09-27T15:36:51.0387980Z submodules: recursive
2022-09-27T15:36:51.0388221Z fetch-depth: 0
2022-09-27T15:36:51.0388451Z env:
2022-09-27T15:36:51.0388689Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:36:51.0388934Z ##[endgroup]
2022-09-27T15:36:51.0690514Z ##[group]Run retry () {
2022-09-27T15:36:51.0690828Z retry () {
2022-09-27T15:36:51.0691141Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
2022-09-27T15:36:51.0691437Z }
2022-09-27T15:36:51.0691688Z echo "${GITHUB_WORKSPACE}"
2022-09-27T15:36:51.0691993Z if [ -z "${NO_SUDO}" ]; then
2022-09-27T15:36:51.0692301Z  retry sudo rm -rf "${GITHUB_WORKSPACE}"
2022-09-27T15:36:51.0692582Z else
2022-09-27T15:36:51.0692837Z  retry rm -rf "${GITHUB_WORKSPACE}"
2022-09-27T15:36:51.0693320Z fi
2022-09-27T15:36:51.0693658Z mkdir "${GITHUB_WORKSPACE}"
2022-09-27T15:36:51.0711637Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2022-09-27T15:36:51.0711976Z env:
2022-09-27T15:36:51.0712228Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:36:51.0712470Z NO_SUDO:
2022-09-27T15:36:51.0712710Z ##[endgroup]
2022-09-27T15:36:51.0946650Z /home/ec2-user/actions-runner/_work/pytorch/pytorch
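The workspace-cleanup step above wraps the removal in a small retry helper with exponential backoff (sleeping 1, 2, 4, then 8 seconds between attempts). A minimal standalone sketch of the same pattern, using quoted "$@" as a slightly safer variant of the log's unquoted $*:

```bash
#!/usr/bin/env bash
# Retry helper as in the step above: one initial try plus four retries,
# backing off 1, 2, 4 and 8 seconds; the exit status is that of the
# last attempt, so callers still see a failure if all attempts fail.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") || (sleep 4 && "$@") || (sleep 8 && "$@")
}

# Example usage (path from this log): remove a workspace that may be
# briefly held open by a lingering process.
retry rm -rf "${GITHUB_WORKSPACE:?}"
```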
2022-09-27T15:36:51.1574743Z ##[group]Run malfet/checkout@silent-checkout
2022-09-27T15:36:51.1575033Z with:
2022-09-27T15:36:51.1575300Z ref: 52424e2bf38e454d535881fed9628d3e20f4f944
2022-09-27T15:36:51.1575579Z fetch-depth: 0
2022-09-27T15:36:51.1575815Z submodules: recursive
2022-09-27T15:36:51.1576080Z quiet-checkout: true
2022-09-27T15:36:51.1576358Z repository: pytorch/pytorch
2022-09-27T15:36:51.1576792Z token: ***
2022-09-27T15:36:51.1577031Z ssh-strict: true
2022-09-27T15:36:51.1577446Z persist-credentials: true
2022-09-27T15:36:51.1577726Z clean: true
2022-09-27T15:36:51.1577946Z lfs: false
2022-09-27T15:36:51.1578204Z set-safe-directory: true
2022-09-27T15:36:51.1578452Z env:
2022-09-27T15:36:51.1578688Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:36:51.1578928Z ##[endgroup]
2022-09-27T15:36:51.3103690Z Syncing repository: pytorch/pytorch
2022-09-27T15:36:51.3105498Z ##[group]Getting Git version info
2022-09-27T15:36:51.3106052Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch'
2022-09-27T15:36:51.3106653Z [command]/usr/bin/git version
2022-09-27T15:36:51.3106908Z git version 2.37.1
2022-09-27T15:36:51.3120067Z ##[endgroup]
2022-09-27T15:36:51.3142139Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/5f9ae682-c239-482d-a860-a9cccf86c9a5' before making global git config changes
2022-09-27T15:36:51.3142729Z Adding repository directory to the temporary git global config as a safe directory
2022-09-27T15:36:51.3150659Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch
2022-09-27T15:36:51.3194940Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch'
2022-09-27T15:36:51.3200504Z ##[group]Initializing the repository
2022-09-27T15:36:51.3206841Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch
2022-09-27T15:36:51.3238614Z hint: Using 'master' as the name for the initial branch. This default branch name
2022-09-27T15:36:51.3239060Z hint: is subject to change. To configure the initial branch name to use in all
2022-09-27T15:36:51.3239490Z hint: of your new repositories, which will suppress this warning, call:
2022-09-27T15:36:51.3239806Z hint:
2022-09-27T15:36:51.3240167Z hint: 	git config --global init.defaultBranch <name>
2022-09-27T15:36:51.3240441Z hint:
2022-09-27T15:36:51.3240823Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
2022-09-27T15:36:51.3241318Z hint: 'development'. The just-created branch can be renamed via this command:
2022-09-27T15:36:51.3241631Z hint:
2022-09-27T15:36:51.3242069Z hint: 	git branch -m <name>
2022-09-27T15:36:51.3242588Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/
2022-09-27T15:36:51.3253449Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch
2022-09-27T15:36:51.3288122Z ##[endgroup]
2022-09-27T15:36:51.3288621Z ##[group]Disabling automatic garbage collection
2022-09-27T15:36:51.3294101Z [command]/usr/bin/git config --local gc.auto 0
2022-09-27T15:36:51.3325246Z ##[endgroup]
2022-09-27T15:36:51.3325685Z ##[group]Setting up auth
2022-09-27T15:36:51.3335740Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2022-09-27T15:36:51.3372822Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :
2022-09-27T15:36:51.3687776Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2022-09-27T15:36:51.3720333Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :
2022-09-27T15:36:51.4007727Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic ***
2022-09-27T15:36:51.4055165Z ##[endgroup]
2022-09-27T15:36:51.4055685Z ##[group]Fetching the repository
2022-09-27T15:36:51.4064241Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --quiet --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/*
2022-09-27T15:37:41.6510039Z [command]/usr/bin/git rev-parse --verify --quiet 52424e2bf38e454d535881fed9628d3e20f4f944^{object}
2022-09-27T15:37:41.6550190Z [command]/usr/bin/git -c protocol.version=2 fetch --no-tags --prune --quiet --no-recurse-submodules origin 52424e2bf38e454d535881fed9628d3e20f4f944
2022-09-27T15:37:42.9519600Z ##[endgroup]
2022-09-27T15:37:42.9520142Z ##[group]Determining the checkout info
2022-09-27T15:37:42.9522332Z ##[endgroup]
2022-09-27T15:37:42.9522959Z ##[group]Checking out the ref
2022-09-27T15:37:42.9528859Z [command]/usr/bin/git checkout --quiet --force 52424e2bf38e454d535881fed9628d3e20f4f944
2022-09-27T15:37:44.5847802Z ##[endgroup]
2022-09-27T15:37:44.5848316Z ##[group]Setting up auth for fetching submodules
2022-09-27T15:37:44.5855708Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic ***
2022-09-27T15:37:44.5909421Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf
2022-09-27T15:37:44.5942749Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com:
2022-09-27T15:37:44.5977110Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com:
2022-09-27T15:37:44.6006671Z ##[endgroup]
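The two auth groups above configure git so that every github.com fetch, including SSH-style submodule URLs, is rewritten to HTTPS and carries the runner's token. A minimal sketch of that configuration, with <BASE64_TOKEN> as a placeholder for the redacted basic-auth value in the log:

```bash
# Rewrite SSH-style remotes to HTTPS so a single HTTP credential
# covers the main repo and all submodules.
git config --global --add url.https://github.com/.insteadOf git@github.com:
git config --global --add url.https://github.com/.insteadOf org-21003710@github.com:

# Attach the token to every HTTPS request to github.com.
# <BASE64_TOKEN> stands in for the *** value redacted above.
git config --global http.https://github.com/.extraheader "AUTHORIZATION: basic <BASE64_TOKEN>"
```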
2022-09-27T15:37:44.6007109Z ##[group]Fetching submodules
2022-09-27T15:37:44.6012250Z [command]/usr/bin/git submodule sync --recursive
2022-09-27T15:37:44.6332791Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive
2022-09-27T15:37:44.6639028Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni'
2022-09-27T15:37:44.6641596Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16'
2022-09-27T15:37:44.6644328Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv'
2022-09-27T15:37:44.6647473Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK'
2022-09-27T15:37:44.6650740Z Submodule 'third_party/QNNPACK' (https://github.com/pytorch/QNNPACK) registered for path 'third_party/QNNPACK'
2022-09-27T15:37:44.6654787Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator'
2022-09-27T15:37:44.6658011Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK'
2022-09-27T15:37:44.6661773Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark'
2022-09-27T15:37:44.6665651Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo'
2022-09-27T15:37:44.6669799Z Submodule 'third_party/cub' (https://github.com/NVlabs/cub.git) registered for path 'third_party/cub'
2022-09-27T15:37:44.6674492Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend'
2022-09-27T15:37:44.6678799Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass'
2022-09-27T15:37:44.6683338Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen'
2022-09-27T15:37:44.6687718Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm'
2022-09-27T15:37:44.6692518Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers'
2022-09-27T15:37:44.6697225Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt'
2022-09-27T15:37:44.6702107Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi'
2022-09-27T15:37:44.6707247Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp'
2022-09-27T15:37:44.6712755Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo'
2022-09-27T15:37:44.6718292Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest'
2022-09-27T15:37:44.6723952Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep'
2022-09-27T15:37:44.6729550Z Submodule 'third_party/ios-cmake' (https://github.com/Yangqing/ios-cmake.git) registered for path 'third_party/ios-cmake'
2022-09-27T15:37:44.6735361Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi'
2022-09-27T15:37:44.6741064Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto'
2022-09-27T15:37:44.6747001Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl'
2022-09-27T15:37:44.6753791Z Submodule 'third_party/neon2sse' (https://github.com/intel/ARM_NEON_2_x86_SSE.git) registered for path 'third_party/neon2sse'
2022-09-27T15:37:44.6759962Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann'
2022-09-27T15:37:44.6766245Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx'
2022-09-27T15:37:44.6772780Z Submodule 'third_party/onnx-tensorrt' (https://github.com/onnx/onnx-tensorrt) registered for path 'third_party/onnx-tensorrt'
2022-09-27T15:37:44.6779370Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft'
2022-09-27T15:37:44.6786081Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf'
2022-09-27T15:37:44.6793871Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd'
2022-09-27T15:37:44.6800830Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool'
2022-09-27T15:37:44.6808062Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11'
2022-09-27T15:37:44.6815213Z Submodule 'third_party/python-enum' (https://github.com/PeachPy/enum34.git) registered for path 'third_party/python-enum'
2022-09-27T15:37:44.6822570Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy'
2022-09-27T15:37:44.6829976Z Submodule 'third_party/python-six' (https://github.com/benjaminp/six.git) registered for path 'third_party/python-six'
2022-09-27T15:37:44.6838276Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef'
2022-09-27T15:37:44.6846094Z Submodule 'third_party/tbb' (https://github.com/01org/tbb) registered for path 'third_party/tbb'
2022-09-27T15:37:44.6854227Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe'
2022-09-27T15:37:44.6862232Z Submodule 'third_party/zstd' (https://github.com/facebook/zstd.git) registered for path 'third_party/zstd'
2022-09-27T15:37:44.6890785Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'...
2022-09-27T15:37:44.9707576Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'...
2022-09-27T15:37:45.2026873Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'...
2022-09-27T15:37:45.4242415Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'...
2022-09-27T15:37:45.7278055Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/QNNPACK'...
2022-09-27T15:37:46.0137788Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'...
2022-09-27T15:37:48.2309985Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'...
2022-09-27T15:37:53.5465569Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'...
2022-09-27T15:37:53.9824871Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'...
2022-09-27T15:37:54.5733814Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cub'...
2022-09-27T15:37:56.0782925Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'...
2022-09-27T15:37:57.4092038Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'...
2022-09-27T15:37:58.8296679Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'...
2022-09-27T15:38:06.7250140Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'...
2022-09-27T15:38:07.5082796Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'...
2022-09-27T15:38:08.8130631Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'...
2022-09-27T15:38:09.9414646Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'...
2022-09-27T15:38:10.1659150Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'...
2022-09-27T15:38:10.7797644Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'...
2022-09-27T15:38:13.2805296Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'...
2022-09-27T15:38:14.2785118Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'...
2022-09-27T15:38:14.6847841Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ios-cmake'...
2022-09-27T15:38:14.9149546Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'...
2022-09-27T15:38:15.1696795Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'...
2022-09-27T15:38:17.7669574Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'...
2022-09-27T15:38:18.2522416Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/neon2sse'...
2022-09-27T15:38:18.6473730Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'...
2022-09-27T15:38:24.7655170Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'...
2022-09-27T15:38:26.2538693Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt'...
2022-09-27T15:38:26.7024407Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'...
2022-09-27T15:38:26.9378384Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'...
2022-09-27T15:38:32.9295556Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'...
2022-09-27T15:38:33.1381777Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'...
2022-09-27T15:38:33.4425066Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'...
2022-09-27T15:38:34.2353176Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-enum'...
2022-09-27T15:38:34.4856910Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'...
2022-09-27T15:38:34.9188668Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-six'...
2022-09-27T15:38:35.2735084Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'...
2022-09-27T15:38:35.8511140Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tbb'...
2022-09-27T15:38:38.5046701Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'...
2022-09-27T15:38:39.0167460Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/zstd'...
2022-09-27T15:38:41.2857853Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f'
2022-09-27T15:38:41.2982587Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3'
2022-09-27T15:38:41.3075312Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1'
2022-09-27T15:38:41.3354372Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73'
2022-09-27T15:38:41.3626553Z Submodule path 'third_party/QNNPACK': checked out '7d2a4e9931a82adc3814275b6219a03e24e36b4c'
2022-09-27T15:38:41.4073983Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191'
2022-09-27T15:38:42.1630361Z Submodule path 'third_party/XNNPACK': checked out 'ae108ef49aa5623b896fc93d4298c49d1750d9ba'
2022-09-27T15:38:42.1879493Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415'
2022-09-27T15:38:42.3146905Z Submodule path 'third_party/cpuinfo': checked out '8ec7bd91ad0470e61cf38f618cc1f270dede599c'
2022-09-27T15:38:42.3544123Z Submodule path 'third_party/cub': checked out 'd106ddb991a56c3df1b6d51b2409e36ba8181ce4'
2022-09-27T15:38:42.7078631Z Submodule path 'third_party/cudnn_frontend': checked out '171a7a986f7fbd9ed71bd0cf3c7ad4f55843d6b3'
2022-09-27T15:38:43.2142508Z Submodule path 'third_party/cutlass': checked out 'b72cbf957df8cf84a6d0ff91c190ad51a9c1d24a'
2022-09-27T15:38:43.5075893Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c'
2022-09-27T15:38:43.5634904Z Submodule path 'third_party/fbgemm': checked out '499cd22f5c2e26041e4f190f628b48478a89a030'
2022-09-27T15:38:43.5651502Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit'
2022-09-27T15:38:43.5654575Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo'
2022-09-27T15:38:43.5657581Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest'
2022-09-27T15:38:43.5661359Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch'
2022-09-27T15:38:43.5687612Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'...
2022-09-27T15:38:44.4167538Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'...
2022-09-27T15:38:45.0006069Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'...
2022-09-27T15:38:45.9608536Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'...
2022-09-27T15:38:46.2825886Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81'
2022-09-27T15:38:46.4036163Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3'
2022-09-27T15:38:46.4719650Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796'
2022-09-27T15:38:46.4829223Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '1840658c184f3eeba787dae0f06c45756c1daaf5'
2022-09-27T15:38:46.5845521Z Submodule path 'third_party/flatbuffers': checked out 'd0cede9c90c5257537c293517a21376408b549fa'
2022-09-27T15:38:46.6234784Z Submodule path 'third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05'
2022-09-27T15:38:46.6336087Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad'
2022-09-27T15:38:46.6803406Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350'
2022-09-27T15:38:46.7089796Z Submodule path 'third_party/gloo': checked out '5b143513263133af2b95547e97c07cebeb72bf72'
2022-09-27T15:38:46.7625085Z Submodule path 'third_party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929'
2022-09-27T15:38:46.7752526Z Submodule path 'third_party/ideep': checked out '77d662b313a762e82b389d3fd965e0098f12cd99'
2022-09-27T15:38:46.7768931Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn'
2022-09-27T15:38:46.7794878Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'...
2022-09-27T15:38:53.9269738Z Submodule path 'third_party/ideep/mkl-dnn': checked out '888a87a954e4fddb4d81fd10858eb834f2441b46'
2022-09-27T15:38:53.9288588Z Submodule 'third_party/oneDNN' (https://github.com/oneapi-src/oneDNN.git) registered for path 'third_party/ideep/mkl-dnn/third_party/oneDNN'
2022-09-27T15:38:53.9316856Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn/third_party/oneDNN'...
2022-09-27T15:39:01.1712175Z Submodule path 'third_party/ideep/mkl-dnn/third_party/oneDNN': checked out '52b5f107dd9cf10910aaa19cb47f3abf9b349815'
2022-09-27T15:39:01.1826642Z Submodule path 'third_party/ios-cmake': checked out '8abaed637d56f1337d6e1d2c4026e25c1eade724'
2022-09-27T15:39:01.1994808Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42'
2022-09-27T15:39:01.3120394Z Submodule path 'third_party/kineto': checked out '0703c78999061b8329dfab7ec5046fc5764a5573'
2022-09-27T15:39:01.3138074Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt'
2022-09-27T15:39:01.3140990Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest'
2022-09-27T15:39:01.3167994Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'...
2022-09-27T15:39:02.6031785Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'...
2022-09-27T15:39:03.6528374Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '2591ab91c3898c9f6544fff04660276537d32ffd'
2022-09-27T15:39:03.7165439Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347'
2022-09-27T15:39:03.7401335Z Submodule path 'third_party/nccl/nccl': checked out 'f89fd4777d2ef9229c039ff750ae21da01626f52'
2022-09-27T15:39:03.7555660Z Submodule path 'third_party/neon2sse': checked out '97a126f08ce318023be604d03f88bf0820a9464a'
2022-09-27T15:39:03.8859260Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806'
2022-09-27T15:39:04.2002700Z Submodule path 'third_party/onnx': checked out 'f7ee1ac60d06abe8e26c9b6bbe1e3db5286b614b'
2022-09-27T15:39:04.2035101Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark'
2022-09-27T15:39:04.2038362Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11'
2022-09-27T15:39:04.2065244Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'...
2022-09-27T15:39:04.6367219Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'...
2022-09-27T15:39:05.5361472Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415'
2022-09-27T15:39:05.5730485Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'ffa346860b306c9bbfb341aed9c14c067751feb8'
2022-09-27T15:39:05.5904249Z Submodule path 'third_party/onnx-tensorrt': checked out 'c153211418a7c57ce071d9ce2a41f8d1c85a878f'
2022-09-27T15:39:05.5921088Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx'
2022-09-27T15:39:05.5945400Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx'...
2022-09-27T15:39:07.2987186Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx': checked out '765f5ee823a67a866f4bd28a9860e81f3c811ce8'
2022-09-27T15:39:07.3009182Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'
2022-09-27T15:39:07.3012279Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'
2022-09-27T15:39:07.3039810Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'...
2022-09-27T15:39:07.7213598Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'...
2022-09-27T15:39:08.5953833Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508'
2022-09-27T15:39:08.6711409Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c'
2022-09-27T15:39:08.6727577Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'
2022-09-27T15:39:08.6755169Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'...
2022-09-27T15:39:08.9710168Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
2022-09-27T15:39:08.9816285Z Submodule path 'third_party/pocketfft': checked out 'ea778e37710c07723435b1be58235996d1d43a5a'
2022-09-27T15:39:09.2911785Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a'
2022-09-27T15:39:09.2933266Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark'
2022-09-27T15:39:09.2936288Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest'
2022-09-27T15:39:09.2963519Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'...
2022-09-27T15:39:09.7258192Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'...
2022-09-27T15:39:10.6932294Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8'
2022-09-27T15:39:10.7750135Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081'
2022-09-27T15:39:10.7842299Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900'
2022-09-27T15:39:10.7968310Z Submodule path 'third_party/pthreadpool': checked out 'a134dd5d4cee80cce15db81a72e7f929d71dd413'
2022-09-27T15:39:10.8353179Z Submodule path 'third_party/pybind11': checked out 'aa304c9c7d725ffb9d10af08a3b34cb372307020'
2022-09-27T15:39:10.8450551Z Submodule path 'third_party/python-enum': checked out '4cfedc426c4e2fc52e3f5c2b4297e15ed8d6b8c7'
2022-09-27T15:39:10.8779075Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67'
2022-09-27T15:39:10.8882301Z Submodule path 'third_party/python-six': checked out '15e31431af97e5e64b80af0a3f598d382bcdd49a'
2022-09-27T15:39:10.9405430Z Submodule path 'third_party/sleef': checked out 'e0a003ee838b75d11763aa9c3ef17bf71a725bff'
2022-09-27T15:39:11.0744738Z Submodule path 'third_party/tbb': checked out 'a51a90bc609bb73db8ea13841b5cf7aa4344d4a9'
2022-09-27T15:39:11.1054330Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e'
2022-09-27T15:39:11.1071764Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest'
2022-09-27T15:39:11.1075276Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop'
2022-09-27T15:39:11.1078473Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv'
2022-09-27T15:39:11.1081956Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11'
2022-09-27T15:39:11.1107558Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'...
2022-09-27T15:39:12.0761828Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'...
2022-09-27T15:39:12.4061899Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'...
2022-09-27T15:39:13.6305945Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'...
2022-09-27T15:39:15.7267063Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e'
2022-09-27T15:39:15.7436561Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281'
2022-09-27T15:39:15.8217690Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242'
2022-09-27T15:39:15.8536790Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef'
2022-09-27T15:39:15.8553085Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2022-09-27T15:39:15.8579979Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'...
2022-09-27T15:39:16.0813798Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
2022-09-27T15:39:16.2394162Z Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8'
2022-09-27T15:39:16.2427635Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0
2022-09-27T15:39:16.2745318Z Entering 'android/libs/fbjni'
2022-09-27T15:39:16.2788501Z Entering 'third_party/FP16'
2022-09-27T15:39:16.2831673Z Entering 'third_party/FXdiv'
2022-09-27T15:39:16.2872685Z Entering 'third_party/NNPACK'
2022-09-27T15:39:16.2915774Z Entering 'third_party/QNNPACK'
2022-09-27T15:39:16.2957610Z Entering 'third_party/VulkanMemoryAllocator'
2022-09-27T15:39:16.3001209Z Entering 'third_party/XNNPACK'
2022-09-27T15:39:16.3054097Z Entering 'third_party/benchmark'
2022-09-27T15:39:16.3096083Z Entering 'third_party/cpuinfo'
2022-09-27T15:39:16.3139001Z Entering 'third_party/cub'
2022-09-27T15:39:16.3181779Z Entering 'third_party/cudnn_frontend'
2022-09-27T15:39:16.3229997Z Entering 'third_party/cutlass'
2022-09-27T15:39:16.3279371Z Entering 'third_party/eigen'
2022-09-27T15:39:16.3323055Z Entering 'third_party/fbgemm'
2022-09-27T15:39:16.3366856Z Entering 'third_party/fbgemm/third_party/asmjit'
2022-09-27T15:39:16.3407299Z Entering 'third_party/fbgemm/third_party/cpuinfo'
2022-09-27T15:39:16.3448012Z Entering 'third_party/fbgemm/third_party/googletest'
2022-09-27T15:39:16.3488939Z Entering 'third_party/fbgemm/third_party/hipify_torch'
2022-09-27T15:39:16.3530901Z Entering 'third_party/flatbuffers'
2022-09-27T15:39:16.3576804Z Entering 'third_party/fmt'
2022-09-27T15:39:16.3619135Z Entering 'third_party/foxi'
2022-09-27T15:39:16.3659847Z Entering 'third_party/gemmlowp/gemmlowp'
2022-09-27T15:39:16.3701575Z Entering 'third_party/gloo'
2022-09-27T15:39:16.3743998Z Entering 'third_party/googletest'
2022-09-27T15:39:16.3786560Z Entering 'third_party/ideep'
2022-09-27T15:39:16.3827484Z Entering 'third_party/ideep/mkl-dnn'
2022-09-27T15:39:16.3870586Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN'
2022-09-27T15:39:16.3918565Z Entering 'third_party/ios-cmake'
2022-09-27T15:39:16.3959665Z Entering 'third_party/ittapi'
2022-09-27T15:39:16.4000895Z Entering 'third_party/kineto'
2022-09-27T15:39:16.4042070Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2022-09-27T15:39:16.4082873Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2022-09-27T15:39:16.4124835Z Entering 'third_party/nccl/nccl'
2022-09-27T15:39:16.4165599Z Entering 'third_party/neon2sse'
2022-09-27T15:39:16.4207117Z Entering 'third_party/nlohmann'
2022-09-27T15:39:16.4250550Z Entering 'third_party/onnx'
2022-09-27T15:39:16.4305801Z Entering 'third_party/onnx/third_party/benchmark'
2022-09-27T15:39:16.4347438Z Entering 'third_party/onnx/third_party/pybind11'
2022-09-27T15:39:16.4391122Z Entering 'third_party/onnx-tensorrt'
2022-09-27T15:39:16.4432432Z Entering 'third_party/onnx-tensorrt/third_party/onnx'
2022-09-27T15:39:16.4480004Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'
2022-09-27T15:39:16.4522994Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'
2022-09-27T15:39:16.4563194Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'
2022-09-27T15:39:16.4609242Z Entering 'third_party/pocketfft'
2022-09-27T15:39:16.4653084Z Entering 'third_party/protobuf'
2022-09-27T15:39:16.4697349Z Entering 'third_party/protobuf/third_party/benchmark'
2022-09-27T15:39:16.4738028Z Entering 'third_party/protobuf/third_party/googletest'
2022-09-27T15:39:16.4781773Z Entering 'third_party/psimd'
2022-09-27T15:39:16.4822941Z Entering 'third_party/pthreadpool'
2022-09-27T15:39:16.4864438Z Entering 'third_party/pybind11'
2022-09-27T15:39:16.4905898Z Entering 'third_party/python-enum'
2022-09-27T15:39:16.4946873Z Entering 'third_party/python-peachpy'
2022-09-27T15:39:16.4988013Z Entering 'third_party/python-six'
2022-09-27T15:39:16.5029633Z Entering 'third_party/sleef'
2022-09-27T15:39:16.5073440Z Entering 'third_party/tbb'
2022-09-27T15:39:16.5117529Z Entering 'third_party/tensorpipe'
2022-09-27T15:39:16.5160329Z Entering 'third_party/tensorpipe/third_party/googletest'
2022-09-27T15:39:16.5201963Z Entering 'third_party/tensorpipe/third_party/libnop'
2022-09-27T15:39:16.5241962Z Entering 'third_party/tensorpipe/third_party/libuv'
2022-09-27T15:39:16.5282968Z Entering 'third_party/tensorpipe/third_party/pybind11'
2022-09-27T15:39:16.5322361Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2022-09-27T15:39:16.5366160Z Entering 'third_party/zstd'
2022-09-27T15:39:16.5416960Z ##[endgroup]
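The whole "Fetching submodules" group above is a plain recursive init pinned to the PR merge commit. A minimal sketch of reproducing the same checkout locally, using the commit SHA and URLs from this log (authentication omitted; a public clone needs none):

```bash
# Reproduce this job's checkout of pytorch/pytorch at the merge commit.
git init pytorch && cd pytorch
git remote add origin https://github.com/pytorch/pytorch
git -c protocol.version=2 fetch --no-tags --prune origin 52424e2bf38e454d535881fed9628d3e20f4f944
git checkout --force 52424e2bf38e454d535881fed9628d3e20f4f944

# Same submodule commands the runner executes above.
git submodule sync --recursive
git -c protocol.version=2 submodule update --init --force --recursive
```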
'third_party/QNNPACK' 2022-09-27T15:39:16.5939135Z Entering 'third_party/VulkanMemoryAllocator' 2022-09-27T15:39:16.5980891Z Entering 'third_party/XNNPACK' 2022-09-27T15:39:16.6033217Z Entering 'third_party/benchmark' 2022-09-27T15:39:16.6074684Z Entering 'third_party/cpuinfo' 2022-09-27T15:39:16.6116682Z Entering 'third_party/cub' 2022-09-27T15:39:16.6158073Z Entering 'third_party/cudnn_frontend' 2022-09-27T15:39:16.6204270Z Entering 'third_party/cutlass' 2022-09-27T15:39:16.6252571Z Entering 'third_party/eigen' 2022-09-27T15:39:16.6295601Z Entering 'third_party/fbgemm' 2022-09-27T15:39:16.6337019Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-09-27T15:39:16.6377346Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-09-27T15:39:16.6418949Z Entering 'third_party/fbgemm/third_party/googletest' 2022-09-27T15:39:16.6458997Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-09-27T15:39:16.6500236Z Entering 'third_party/flatbuffers' 2022-09-27T15:39:16.6543111Z Entering 'third_party/fmt' 2022-09-27T15:39:16.6584071Z Entering 'third_party/foxi' 2022-09-27T15:39:16.6625587Z Entering 'third_party/gemmlowp/gemmlowp' 2022-09-27T15:39:16.6665944Z Entering 'third_party/gloo' 2022-09-27T15:39:16.6707220Z Entering 'third_party/googletest' 2022-09-27T15:39:16.6748735Z Entering 'third_party/ideep' 2022-09-27T15:39:16.6788909Z Entering 'third_party/ideep/mkl-dnn' 2022-09-27T15:39:16.6830113Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-09-27T15:39:16.6878416Z Entering 'third_party/ios-cmake' 2022-09-27T15:39:16.6918922Z Entering 'third_party/ittapi' 2022-09-27T15:39:16.6960424Z Entering 'third_party/kineto' 2022-09-27T15:39:16.7001753Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-09-27T15:39:16.7042107Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-09-27T15:39:16.7084591Z Entering 'third_party/nccl/nccl' 2022-09-27T15:39:16.7124873Z Entering 'third_party/neon2sse' 2022-09-27T15:39:16.7165079Z Entering 'third_party/nlohmann' 2022-09-27T15:39:16.7206400Z Entering 'third_party/onnx' 2022-09-27T15:39:16.7259710Z Entering 'third_party/onnx/third_party/benchmark' 2022-09-27T15:39:16.7300849Z Entering 'third_party/onnx/third_party/pybind11' 2022-09-27T15:39:16.7343499Z Entering 'third_party/onnx-tensorrt' 2022-09-27T15:39:16.7384427Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-09-27T15:39:16.7431872Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-09-27T15:39:16.7471962Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-09-27T15:39:16.7513253Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-09-27T15:39:16.7557998Z Entering 'third_party/pocketfft' 2022-09-27T15:39:16.7599168Z Entering 'third_party/protobuf' 2022-09-27T15:39:16.7643529Z Entering 'third_party/protobuf/third_party/benchmark' 2022-09-27T15:39:16.7684081Z Entering 'third_party/protobuf/third_party/googletest' 2022-09-27T15:39:16.7726563Z Entering 'third_party/psimd' 2022-09-27T15:39:16.7768356Z Entering 'third_party/pthreadpool' 2022-09-27T15:39:16.7808838Z Entering 'third_party/pybind11' 2022-09-27T15:39:16.7849509Z Entering 'third_party/python-enum' 2022-09-27T15:39:16.7890922Z Entering 'third_party/python-peachpy' 2022-09-27T15:39:16.7931590Z Entering 'third_party/python-six' 2022-09-27T15:39:16.7972598Z Entering 'third_party/sleef' 2022-09-27T15:39:16.8013813Z Entering 'third_party/tbb' 2022-09-27T15:39:16.8056407Z Entering 'third_party/tensorpipe' 
2022-09-27T15:39:16.8097054Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-09-27T15:39:16.8137716Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-09-27T15:39:16.8177638Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-09-27T15:39:16.8217951Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-09-27T15:39:16.8258657Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-09-27T15:39:16.8301300Z Entering 'third_party/zstd' 2022-09-27T15:39:16.8356035Z [command]/usr/bin/git submodule foreach --recursive git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url 2022-09-27T15:39:16.8659097Z Entering 'android/libs/fbjni' 2022-09-27T15:39:16.8696836Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2022-09-27T15:39:16.8714914Z Entering 'third_party/FP16' 2022-09-27T15:39:16.8751701Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2022-09-27T15:39:16.8769140Z Entering 'third_party/FXdiv' 2022-09-27T15:39:16.8807384Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2022-09-27T15:39:16.8824089Z Entering 'third_party/NNPACK' 2022-09-27T15:39:16.8862561Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2022-09-27T15:39:16.8880544Z Entering 'third_party/QNNPACK' 2022-09-27T15:39:16.8918055Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/QNNPACK/config remote.origin.url 2022-09-27T15:39:16.8935088Z Entering 'third_party/VulkanMemoryAllocator' 2022-09-27T15:39:16.8973583Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2022-09-27T15:39:16.8990635Z Entering 'third_party/XNNPACK' 2022-09-27T15:39:16.9028180Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2022-09-27T15:39:16.9056400Z Entering 'third_party/benchmark' 2022-09-27T15:39:16.9094643Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2022-09-27T15:39:16.9112853Z Entering 'third_party/cpuinfo' 2022-09-27T15:39:16.9149709Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2022-09-27T15:39:16.9168775Z Entering 'third_party/cub' 2022-09-27T15:39:16.9206511Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cub/config remote.origin.url 2022-09-27T15:39:16.9224706Z Entering 'third_party/cudnn_frontend' 2022-09-27T15:39:16.9262186Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2022-09-27T15:39:16.9287416Z Entering 'third_party/cutlass' 2022-09-27T15:39:16.9324305Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2022-09-27T15:39:16.9348151Z Entering 'third_party/eigen' 2022-09-27T15:39:16.9386021Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2022-09-27T15:39:16.9406071Z Entering 'third_party/fbgemm' 2022-09-27T15:39:16.9445417Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2022-09-27T15:39:16.9462453Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-09-27T15:39:16.9499431Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2022-09-27T15:39:16.9516687Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-09-27T15:39:16.9554980Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2022-09-27T15:39:16.9572197Z Entering 'third_party/fbgemm/third_party/googletest' 2022-09-27T15:39:16.9609943Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2022-09-27T15:39:16.9627345Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-09-27T15:39:16.9664135Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2022-09-27T15:39:16.9683261Z Entering 'third_party/flatbuffers' 2022-09-27T15:39:16.9721579Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2022-09-27T15:39:16.9741379Z Entering 'third_party/fmt' 2022-09-27T15:39:16.9779526Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2022-09-27T15:39:16.9797537Z Entering 'third_party/foxi' 2022-09-27T15:39:16.9835169Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2022-09-27T15:39:16.9852441Z Entering 'third_party/gemmlowp/gemmlowp' 2022-09-27T15:39:16.9890574Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2022-09-27T15:39:16.9908050Z Entering 'third_party/gloo' 2022-09-27T15:39:16.9946321Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2022-09-27T15:39:16.9964355Z Entering 'third_party/googletest' 2022-09-27T15:39:17.0002784Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2022-09-27T15:39:17.0021199Z Entering 'third_party/ideep' 2022-09-27T15:39:17.0058590Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2022-09-27T15:39:17.0075648Z Entering 'third_party/ideep/mkl-dnn' 2022-09-27T15:39:17.0112075Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2022-09-27T15:39:17.0133312Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-09-27T15:39:17.0170907Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/modules/third_party/oneDNN/config remote.origin.url 2022-09-27T15:39:17.0194633Z Entering 'third_party/ios-cmake' 2022-09-27T15:39:17.0231841Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ios-cmake/config remote.origin.url 2022-09-27T15:39:17.0249524Z Entering 'third_party/ittapi' 2022-09-27T15:39:17.0287469Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2022-09-27T15:39:17.0305427Z Entering 'third_party/kineto' 
2022-09-27T15:39:17.0343166Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2022-09-27T15:39:17.0360657Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-09-27T15:39:17.0398818Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2022-09-27T15:39:17.0415678Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-09-27T15:39:17.0453177Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2022-09-27T15:39:17.0471731Z Entering 'third_party/nccl/nccl' 2022-09-27T15:39:17.0510266Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2022-09-27T15:39:17.0528497Z Entering 'third_party/neon2sse' 2022-09-27T15:39:17.0566326Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/neon2sse/config remote.origin.url 2022-09-27T15:39:17.0583682Z Entering 'third_party/nlohmann' 2022-09-27T15:39:17.0621110Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2022-09-27T15:39:17.0640359Z Entering 'third_party/onnx' 2022-09-27T15:39:17.0678595Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2022-09-27T15:39:17.0707968Z Entering 'third_party/onnx/third_party/benchmark' 2022-09-27T15:39:17.0745837Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-09-27T15:39:17.0763225Z Entering 'third_party/onnx/third_party/pybind11' 2022-09-27T15:39:17.0800880Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-09-27T15:39:17.0820160Z Entering 'third_party/onnx-tensorrt' 2022-09-27T15:39:17.0858738Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/config remote.origin.url 2022-09-27T15:39:17.0876137Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-09-27T15:39:17.0913662Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/config remote.origin.url 2022-09-27T15:39:17.0935233Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-09-27T15:39:17.0973973Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-09-27T15:39:17.0991216Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-09-27T15:39:17.1029253Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-09-27T15:39:17.1046711Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-09-27T15:39:17.1084524Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-09-27T15:39:17.1106555Z Entering 'third_party/pocketfft' 2022-09-27T15:39:17.1145517Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2022-09-27T15:39:17.1162984Z Entering 'third_party/protobuf' 2022-09-27T15:39:17.1200798Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2022-09-27T15:39:17.1221484Z Entering 'third_party/protobuf/third_party/benchmark' 2022-09-27T15:39:17.1259340Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2022-09-27T15:39:17.1277409Z Entering 'third_party/protobuf/third_party/googletest' 2022-09-27T15:39:17.1315531Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2022-09-27T15:39:17.1334780Z Entering 'third_party/psimd' 2022-09-27T15:39:17.1372922Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2022-09-27T15:39:17.1390212Z Entering 'third_party/pthreadpool' 2022-09-27T15:39:17.1427909Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2022-09-27T15:39:17.1446075Z Entering 'third_party/pybind11' 2022-09-27T15:39:17.1483395Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2022-09-27T15:39:17.1501228Z Entering 'third_party/python-enum' 2022-09-27T15:39:17.1539505Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-enum/config remote.origin.url 2022-09-27T15:39:17.1557540Z Entering 'third_party/python-peachpy' 2022-09-27T15:39:17.1595260Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2022-09-27T15:39:17.1613118Z Entering 'third_party/python-six' 2022-09-27T15:39:17.1651662Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-six/config remote.origin.url 2022-09-27T15:39:17.1669078Z Entering 'third_party/sleef' 2022-09-27T15:39:17.1707027Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2022-09-27T15:39:17.1725063Z Entering 'third_party/tbb' 2022-09-27T15:39:17.1762312Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tbb/config remote.origin.url 2022-09-27T15:39:17.1781722Z Entering 'third_party/tensorpipe' 2022-09-27T15:39:17.1819469Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2022-09-27T15:39:17.1837093Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-09-27T15:39:17.1874569Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2022-09-27T15:39:17.1891361Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-09-27T15:39:17.1928893Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2022-09-27T15:39:17.1945056Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-09-27T15:39:17.1983250Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2022-09-27T15:39:17.2000241Z Entering 
'third_party/tensorpipe/third_party/pybind11' 2022-09-27T15:39:17.2038349Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2022-09-27T15:39:17.2054232Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-09-27T15:39:17.2092374Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-09-27T15:39:17.2112404Z Entering 'third_party/zstd' 2022-09-27T15:39:17.2149739Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/zstd/config remote.origin.url 2022-09-27T15:39:17.3069648Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2022-09-27T15:39:17.3377239Z Entering 'android/libs/fbjni' 2022-09-27T15:39:17.3418423Z Entering 'third_party/FP16' 2022-09-27T15:39:17.3460253Z Entering 'third_party/FXdiv' 2022-09-27T15:39:17.3502724Z Entering 'third_party/NNPACK' 2022-09-27T15:39:17.3544702Z Entering 'third_party/QNNPACK' 2022-09-27T15:39:17.3586381Z Entering 'third_party/VulkanMemoryAllocator' 2022-09-27T15:39:17.3627801Z Entering 'third_party/XNNPACK' 2022-09-27T15:39:17.3679653Z Entering 'third_party/benchmark' 2022-09-27T15:39:17.3722061Z Entering 'third_party/cpuinfo' 2022-09-27T15:39:17.3764146Z Entering 'third_party/cub' 2022-09-27T15:39:17.3805263Z Entering 'third_party/cudnn_frontend' 2022-09-27T15:39:17.3851999Z Entering 'third_party/cutlass' 2022-09-27T15:39:17.3900103Z Entering 'third_party/eigen' 2022-09-27T15:39:17.3943316Z Entering 'third_party/fbgemm' 2022-09-27T15:39:17.3985355Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-09-27T15:39:17.4026416Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-09-27T15:39:17.4067786Z Entering 'third_party/fbgemm/third_party/googletest' 2022-09-27T15:39:17.4108605Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-09-27T15:39:17.4150728Z Entering 'third_party/flatbuffers' 2022-09-27T15:39:17.4194963Z Entering 'third_party/fmt' 2022-09-27T15:39:17.4237225Z Entering 'third_party/foxi' 2022-09-27T15:39:17.4278868Z Entering 'third_party/gemmlowp/gemmlowp' 2022-09-27T15:39:17.4322249Z Entering 'third_party/gloo' 2022-09-27T15:39:17.4363534Z Entering 'third_party/googletest' 2022-09-27T15:39:17.4405945Z Entering 'third_party/ideep' 2022-09-27T15:39:17.4446232Z Entering 'third_party/ideep/mkl-dnn' 2022-09-27T15:39:17.4489026Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-09-27T15:39:17.4537653Z Entering 'third_party/ios-cmake' 2022-09-27T15:39:17.4580998Z Entering 'third_party/ittapi' 2022-09-27T15:39:17.4621551Z Entering 'third_party/kineto' 2022-09-27T15:39:17.4665301Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-09-27T15:39:17.4707935Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-09-27T15:39:17.4750428Z Entering 'third_party/nccl/nccl' 2022-09-27T15:39:17.4792024Z Entering 'third_party/neon2sse' 2022-09-27T15:39:17.4833225Z Entering 'third_party/nlohmann' 2022-09-27T15:39:17.4877090Z Entering 'third_party/onnx' 2022-09-27T15:39:17.4931471Z Entering 'third_party/onnx/third_party/benchmark' 2022-09-27T15:39:17.4972585Z Entering 'third_party/onnx/third_party/pybind11' 2022-09-27T15:39:17.5016893Z Entering 'third_party/onnx-tensorrt' 2022-09-27T15:39:17.5058899Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-09-27T15:39:17.5105089Z 
Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-09-27T15:39:17.5147593Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-09-27T15:39:17.5191421Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-09-27T15:39:17.5239383Z Entering 'third_party/pocketfft' 2022-09-27T15:39:17.5280946Z Entering 'third_party/protobuf' 2022-09-27T15:39:17.5326081Z Entering 'third_party/protobuf/third_party/benchmark' 2022-09-27T15:39:17.5367204Z Entering 'third_party/protobuf/third_party/googletest' 2022-09-27T15:39:17.5410178Z Entering 'third_party/psimd' 2022-09-27T15:39:17.5453036Z Entering 'third_party/pthreadpool' 2022-09-27T15:39:17.5496279Z Entering 'third_party/pybind11' 2022-09-27T15:39:17.5538528Z Entering 'third_party/python-enum' 2022-09-27T15:39:17.5580409Z Entering 'third_party/python-peachpy' 2022-09-27T15:39:17.5622250Z Entering 'third_party/python-six' 2022-09-27T15:39:17.5663837Z Entering 'third_party/sleef' 2022-09-27T15:39:17.5706563Z Entering 'third_party/tbb' 2022-09-27T15:39:17.5749614Z Entering 'third_party/tensorpipe' 2022-09-27T15:39:17.5791760Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-09-27T15:39:17.5833460Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-09-27T15:39:17.5874556Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-09-27T15:39:17.5915991Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-09-27T15:39:17.5957226Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-09-27T15:39:17.6001444Z Entering 'third_party/zstd' 2022-09-27T15:39:17.6057949Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2022-09-27T15:39:17.6361090Z Entering 'android/libs/fbjni' 2022-09-27T15:39:17.6403224Z Entering 'third_party/FP16' 2022-09-27T15:39:17.6444549Z Entering 'third_party/FXdiv' 2022-09-27T15:39:17.6485385Z Entering 'third_party/NNPACK' 2022-09-27T15:39:17.6527217Z Entering 'third_party/QNNPACK' 2022-09-27T15:39:17.6570036Z Entering 'third_party/VulkanMemoryAllocator' 2022-09-27T15:39:17.6612158Z Entering 'third_party/XNNPACK' 2022-09-27T15:39:17.6664164Z Entering 'third_party/benchmark' 2022-09-27T15:39:17.6705735Z Entering 'third_party/cpuinfo' 2022-09-27T15:39:17.6748565Z Entering 'third_party/cub' 2022-09-27T15:39:17.6790626Z Entering 'third_party/cudnn_frontend' 2022-09-27T15:39:17.6838993Z Entering 'third_party/cutlass' 2022-09-27T15:39:17.6887896Z Entering 'third_party/eigen' 2022-09-27T15:39:17.6932584Z Entering 'third_party/fbgemm' 2022-09-27T15:39:17.6976182Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-09-27T15:39:17.7018059Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-09-27T15:39:17.7060541Z Entering 'third_party/fbgemm/third_party/googletest' 2022-09-27T15:39:17.7101442Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-09-27T15:39:17.7144068Z Entering 'third_party/flatbuffers' 2022-09-27T15:39:17.7187893Z Entering 'third_party/fmt' 2022-09-27T15:39:17.7229764Z Entering 'third_party/foxi' 2022-09-27T15:39:17.7271533Z Entering 'third_party/gemmlowp/gemmlowp' 2022-09-27T15:39:17.7314253Z Entering 'third_party/gloo' 2022-09-27T15:39:17.7356230Z Entering 'third_party/googletest' 2022-09-27T15:39:17.7398183Z Entering 'third_party/ideep' 2022-09-27T15:39:17.7438794Z Entering 'third_party/ideep/mkl-dnn' 2022-09-27T15:39:17.7482169Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 
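
The two `git submodule foreach` passes running here add `url.<base>.insteadOf` rewrites in every submodule so that SSH-style remotes (`git@github.com:` and the `org-21003710@github.com:` form) are fetched over HTTPS instead. A minimal single-repository sketch of the same rewrite, assuming a hypothetical checkout at /tmp/repo:

    # Rewrite SSH-style GitHub remotes to HTTPS for this one repository only.
    git -C /tmp/repo config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:'
    # From now on, a remote recorded as git@github.com:pytorch/pytorch.git is
    # contacted as https://github.com/pytorch/pytorch.git; .gitmodules and the
    # configured remote URLs themselves are left untouched.
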
2022-09-27T15:39:17.7529507Z Entering 'third_party/ios-cmake' 2022-09-27T15:39:17.7571037Z Entering 'third_party/ittapi' 2022-09-27T15:39:17.7612170Z Entering 'third_party/kineto' 2022-09-27T15:39:17.7653677Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-09-27T15:39:17.7694610Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-09-27T15:39:17.7737393Z Entering 'third_party/nccl/nccl' 2022-09-27T15:39:17.7779524Z Entering 'third_party/neon2sse' 2022-09-27T15:39:17.7821482Z Entering 'third_party/nlohmann' 2022-09-27T15:39:17.7864620Z Entering 'third_party/onnx' 2022-09-27T15:39:17.7919326Z Entering 'third_party/onnx/third_party/benchmark' 2022-09-27T15:39:17.7962519Z Entering 'third_party/onnx/third_party/pybind11' 2022-09-27T15:39:17.8005720Z Entering 'third_party/onnx-tensorrt' 2022-09-27T15:39:17.8047444Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-09-27T15:39:17.8093325Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-09-27T15:39:17.8136102Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-09-27T15:39:17.8177456Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-09-27T15:39:17.8223576Z Entering 'third_party/pocketfft' 2022-09-27T15:39:17.8265654Z Entering 'third_party/protobuf' 2022-09-27T15:39:17.8311307Z Entering 'third_party/protobuf/third_party/benchmark' 2022-09-27T15:39:17.8352947Z Entering 'third_party/protobuf/third_party/googletest' 2022-09-27T15:39:17.8396212Z Entering 'third_party/psimd' 2022-09-27T15:39:17.8438625Z Entering 'third_party/pthreadpool' 2022-09-27T15:39:17.8481068Z Entering 'third_party/pybind11' 2022-09-27T15:39:17.8522999Z Entering 'third_party/python-enum' 2022-09-27T15:39:17.8564716Z Entering 'third_party/python-peachpy' 2022-09-27T15:39:17.8606153Z Entering 'third_party/python-six' 2022-09-27T15:39:17.8648395Z Entering 'third_party/sleef' 2022-09-27T15:39:17.8690267Z Entering 'third_party/tbb' 2022-09-27T15:39:17.8734082Z Entering 'third_party/tensorpipe' 2022-09-27T15:39:17.8775712Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-09-27T15:39:17.8817343Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-09-27T15:39:17.8859580Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-09-27T15:39:17.8901649Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-09-27T15:39:17.8942522Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-09-27T15:39:17.8987850Z Entering 'third_party/zstd' 2022-09-27T15:39:17.9039497Z ##[endgroup] 2022-09-27T15:39:17.9084831Z [command]/usr/bin/git log -1 --format='%H' 2022-09-27T15:39:17.9114092Z '52424e2bf38e454d535881fed9628d3e20f4f944' 2022-09-27T15:39:17.9264795Z Prepare all required actions 2022-09-27T15:39:17.9337098Z ##[group]Run ./.github/actions/setup-linux 2022-09-27T15:39:17.9337367Z env: 2022-09-27T15:39:17.9337602Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:17.9337838Z ##[endgroup] 2022-09-27T15:39:17.9357555Z ##[group]Run set -euo pipefail 2022-09-27T15:39:17.9357864Z set -euo pipefail 2022-09-27T15:39:17.9358148Z function get_ec2_metadata() { 2022-09-27T15:39:17.9358460Z  # Pulled from instance metadata endpoint for EC2 2022-09-27T15:39:17.9358932Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2022-09-27T15:39:17.9359329Z  category=$1 2022-09-27T15:39:17.9359650Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2022-09-27T15:39:17.9359935Z } 
2022-09-27T15:39:17.9360353Z echo "ami-id: $(get_ec2_metadata ami-id)" 2022-09-27T15:39:17.9360734Z echo "instance-id: $(get_ec2_metadata instance-id)" 2022-09-27T15:39:17.9361088Z echo "instance-type: $(get_ec2_metadata instance-type)" 2022-09-27T15:39:17.9361422Z echo "system info $(uname -a)" 2022-09-27T15:39:17.9373893Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T15:39:17.9374167Z env: 2022-09-27T15:39:17.9374402Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:17.9374653Z ##[endgroup] 2022-09-27T15:39:17.9474235Z ami-id: ami-096198a0bccc6bad4 2022-09-27T15:39:17.9537609Z instance-id: i-0f5565a17788248fc 2022-09-27T15:39:17.9598658Z instance-type: g3.8xlarge 2022-09-27T15:39:17.9607120Z system info Linux ip-10-0-6-59.ec2.internal 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 2022-09-27T15:39:17.9628179Z ##[group]Run if systemctl is-active --quiet docker; then 2022-09-27T15:39:17.9628551Z if systemctl is-active --quiet docker; then 2022-09-27T15:39:17.9628886Z  echo "Docker daemon is running..."; 2022-09-27T15:39:17.9629201Z else 2022-09-27T15:39:17.9629517Z  echo "Starting docker daemon..." && sudo systemctl start docker; 2022-09-27T15:39:17.9629816Z fi 2022-09-27T15:39:17.9641638Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T15:39:17.9641929Z env: 2022-09-27T15:39:17.9642165Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:17.9642407Z ##[endgroup] 2022-09-27T15:39:17.9692248Z Docker daemon is running... 2022-09-27T15:39:17.9713452Z ##[group]Run AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-09-27T15:39:17.9713923Z AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-09-27T15:39:17.9714305Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-09-27T15:39:17.9714885Z retry aws ecr get-login*** "$AWS_DEFAULT_REGION" | docker login --username AWS \ 2022-09-27T15:39:17.9715353Z  --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2022-09-27T15:39:17.9726582Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T15:39:17.9726874Z env: 2022-09-27T15:39:17.9727117Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:17.9727366Z AWS_RETRY_MODE: standard 2022-09-27T15:39:17.9727620Z AWS_MAX_ATTEMPTS: 5 2022-09-27T15:39:17.9727887Z AWS_DEFAULT_REGION: us-east-1 2022-09-27T15:39:17.9728127Z ##[endgroup] 2022-09-27T15:39:18.9029796Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2022-09-27T15:39:18.9030286Z Configure a credential helper to remove this warning.
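
The group that just ran logs Docker into ECR by piping a short-lived registry token straight into `docker login --password-stdin`, wrapped in a small retry helper. A minimal standalone sketch of the same pattern, assuming the masked portion of the command above is the AWS CLI's `aws ecr get-login-password` (region and account-id parsing follow the log):

    #!/usr/bin/env bash
    set -euo pipefail

    # Bounded retry: one attempt plus two back-off retries, as in the log.
    retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") }

    AWS_DEFAULT_REGION="us-east-1"
    # Crude but dependency-free account-id extraction, mirroring the step above.
    AWS_ACCOUNT_ID=$(aws sts get-caller-identity | grep Account | cut -f4 -d\")

    # Assumption: the masked command is `aws ecr get-login-password`, which
    # prints a token that docker reads from stdin so it never appears in argv.
    retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" \
      | docker login --username AWS --password-stdin \
        "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com"

The "WARNING! Your password will be stored unencrypted" message that follows is expected with this flow; it disappears once a credential helper is configured in ~/.docker/config.json, per the link below.
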
See 2022-09-27T15:39:18.9031303Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2022-09-27T15:39:18.9031615Z 2022-09-27T15:39:18.9032210Z Login Succeeded 2022-09-27T15:39:18.9073978Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-09-27T15:39:18.9074399Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-09-27T15:39:18.9074867Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-09-27T15:39:18.9087393Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T15:39:18.9087695Z env: 2022-09-27T15:39:18.9087937Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:18.9088179Z ##[endgroup] 2022-09-27T15:39:18.9162873Z Prepare all required actions 2022-09-27T15:39:18.9163252Z Getting action download info 2022-09-27T15:39:19.0794727Z Download action repository 'seemethere/add-github-ssh-key@v1' (SHA:105f7619adc4054f5f1be5f79ebd354d82384638) 2022-09-27T15:39:19.2778903Z ##[group]Run ./.github/actions/setup-ssh 2022-09-27T15:39:19.2779192Z with: 2022-09-27T15:39:19.2779659Z github-secret: *** 2022-09-27T15:39:19.2779926Z env: 2022-09-27T15:39:19.2780179Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:19.2780423Z ##[endgroup] 2022-09-27T15:39:19.2810219Z ##[group]Run seemethere/add-github-ssh-key@v1 2022-09-27T15:39:19.2810665Z with: 2022-09-27T15:39:19.2811082Z GITHUB_TOKEN: *** 2022-09-27T15:39:19.2811352Z activate-with-label: false 2022-09-27T15:39:19.2811633Z label: with-ssh 2022-09-27T15:39:19.2811918Z remove-existing-keys: true 2022-09-27T15:39:19.2812158Z env: 2022-09-27T15:39:19.2812407Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:19.2812675Z ##[endgroup] 2022-09-27T15:39:19.7097295Z Grabbing public ssh keys from https://github.com/kongzii.keys 2022-09-27T15:39:19.7975424Z ~/.ssh/authorized_keys file found on node, removing ~/.ssh and starting fresh 2022-09-27T15:39:19.7996219Z Public keys pulled and installed to /home/ec2-user/.ssh/authorized_keys 2022-09-27T15:39:19.8031534Z Login using: ssh ec2-user@ec2-3-80-238-161.compute-1.amazonaws.com 2022-09-27T15:39:19.8088352Z Prepare all required actions 2022-09-27T15:39:19.8113146Z ##[group]Run ./.github/actions/pull-docker-image 2022-09-27T15:39:19.8113431Z with: 2022-09-27T15:39:19.8113922Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T15:39:19.8114385Z env: 2022-09-27T15:39:19.8114623Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:19.8114879Z ##[endgroup] 2022-09-27T15:39:19.8169547Z ##[group]Run retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-09-27T15:39:19.8169917Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-09-27T15:39:19.8170273Z # ignore output since only exit code is used for conditional 2022-09-27T15:39:19.8170631Z # only pull docker image if it's not available locally 2022-09-27T15:39:19.8171027Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2022-09-27T15:39:19.8171430Z  retry docker pull "${DOCKER_IMAGE}" 2022-09-27T15:39:19.8171685Z fi 2022-09-27T15:39:19.8183994Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T15:39:19.8184287Z env: 2022-09-27T15:39:19.8184537Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:39:19.8185035Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T15:39:19.8185515Z ##[endgroup] 2022-09-27T15:39:20.0780015Z e66cf5fa0a4d4ed512901b12ccdab95cca946a29: Pulling from pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7 2022-09-27T15:39:20.0780966Z 40dd5be53814: Pulling fs layer 2022-09-27T15:39:20.0781500Z bd44602516a4: Pulling fs layer 2022-09-27T15:39:20.0782060Z 8ebfb31ea67d: Pulling fs layer 2022-09-27T15:39:20.0782595Z 1589dc294916: Pulling fs layer 2022-09-27T15:39:20.0783112Z 2c3a764ff1ef: Pulling fs layer 2022-09-27T15:39:20.0783657Z 2fb24fb5f7cb: Pulling fs layer 2022-09-27T15:39:20.0784226Z d6e4b45751c9: Pulling fs layer 2022-09-27T15:39:20.0784729Z 98a26bc0781e: Pulling fs layer 2022-09-27T15:39:20.0784986Z 07c42b0591b2: Pulling fs layer 2022-09-27T15:39:20.0785253Z 9be88323b57e: Pulling fs layer 2022-09-27T15:39:20.0785520Z 2c7b68ade49f: Pulling fs layer 2022-09-27T15:39:20.0785784Z 44206692de1d: Pulling fs layer 2022-09-27T15:39:20.0786045Z f751461554fa: Pulling fs layer 2022-09-27T15:39:20.0786306Z 316750fef2e6: Pulling fs layer 2022-09-27T15:39:20.0786553Z c069021d810b: Pulling fs layer 2022-09-27T15:39:20.0786805Z 07c42b0591b2: Waiting 2022-09-27T15:39:20.0787067Z e0fdd58e805b: Pulling fs layer 2022-09-27T15:39:20.0787316Z 751286b45698: Pulling fs layer 2022-09-27T15:39:20.0787569Z 2fb24fb5f7cb: Waiting 2022-09-27T15:39:20.0787812Z 9be88323b57e: Waiting 2022-09-27T15:39:20.0788031Z 2c7b68ade49f: Waiting 2022-09-27T15:39:20.0788269Z d6e4b45751c9: Waiting 2022-09-27T15:39:20.0788522Z 0c8bd29be614: Pulling fs layer 2022-09-27T15:39:20.0788774Z 3bb9e7ea569e: Pulling fs layer 2022-09-27T15:39:20.0789054Z efeff9c74fbf: Pulling fs layer 2022-09-27T15:39:20.0789325Z 81a5271d43c8: Pulling fs layer 2022-09-27T15:39:20.0789557Z 44206692de1d: Waiting 2022-09-27T15:39:20.0789813Z 903ca36d4d71: Pulling fs layer 2022-09-27T15:39:20.0790264Z d52c758f8e75: Pulling fs layer 2022-09-27T15:39:20.0790517Z a4ce2fdd9133: Pulling fs layer 2022-09-27T15:39:20.0791077Z cae8823a1cd1: Pulling fs layer 2022-09-27T15:39:20.0791337Z 316750fef2e6: Waiting 2022-09-27T15:39:20.0791572Z 3298fe919163: Pulling fs layer 2022-09-27T15:39:20.0791877Z b9b9b9d06eef: Pulling fs layer 2022-09-27T15:39:20.0792131Z f751461554fa: Waiting 2022-09-27T15:39:20.0792383Z 62fa99d47769: Pulling fs layer 2022-09-27T15:39:20.0792629Z 17acc9e30503: Pulling fs layer 2022-09-27T15:39:20.0792882Z 1589dc294916: Waiting 2022-09-27T15:39:20.0793120Z 2c3a764ff1ef: Waiting 2022-09-27T15:39:20.0793348Z efeff9c74fbf: Waiting 2022-09-27T15:39:20.0793588Z 0c8bd29be614: Waiting 2022-09-27T15:39:20.0793824Z c069021d810b: Waiting 2022-09-27T15:39:20.0794159Z e8b4222e7a59: Pulling fs layer 2022-09-27T15:39:20.0794438Z b752992950f8: Pulling fs layer 2022-09-27T15:39:20.0794865Z a4ce2fdd9133: Waiting 2022-09-27T15:39:20.0795298Z cc8443c330a0: Pulling fs layer 2022-09-27T15:39:20.0795574Z c2fcfa2400df: Pulling fs layer 2022-09-27T15:39:20.0795855Z dce607cbd09e: Pulling fs layer 2022-09-27T15:39:20.0796104Z 45b253446018: Pulling fs layer 
2022-09-27T15:39:20.0796355Z cae8823a1cd1: Waiting 2022-09-27T15:39:20.0796596Z d52c758f8e75: Waiting 2022-09-27T15:39:20.0796829Z 752f98c7a6d7: Pulling fs layer 2022-09-27T15:39:20.0797076Z 3298fe919163: Waiting 2022-09-27T15:39:20.0797324Z 410e31c94a04: Pulling fs layer 2022-09-27T15:39:20.0797573Z aa4bb3ec24a7: Pulling fs layer 2022-09-27T15:39:20.0797843Z 6a9eea4b3aa4: Pulling fs layer 2022-09-27T15:39:20.0798102Z dce607cbd09e: Waiting 2022-09-27T15:39:20.0798327Z e8b4222e7a59: Waiting 2022-09-27T15:39:20.0798557Z 45b253446018: Waiting 2022-09-27T15:39:20.0798805Z a823f5718e87: Pulling fs layer 2022-09-27T15:39:20.0799041Z 17acc9e30503: Waiting 2022-09-27T15:39:20.0799293Z 788ace045743: Pulling fs layer 2022-09-27T15:39:20.0799559Z c76c6ad2ac0f: Pulling fs layer 2022-09-27T15:39:20.0799814Z 752f98c7a6d7: Waiting 2022-09-27T15:39:20.0800053Z 6cd5f9a2c4ae: Pulling fs layer 2022-09-27T15:39:20.0800309Z 410e31c94a04: Waiting 2022-09-27T15:39:20.0800548Z aa4bb3ec24a7: Waiting 2022-09-27T15:39:20.0800785Z 577da355ab1b: Pulling fs layer 2022-09-27T15:39:20.0801046Z 649c4428b346: Pulling fs layer 2022-09-27T15:39:20.0801292Z a823f5718e87: Waiting 2022-09-27T15:39:20.0801506Z 788ace045743: Waiting 2022-09-27T15:39:20.0801759Z eadc05ea2cd3: Pulling fs layer 2022-09-27T15:39:20.0802026Z 93e5a7080833: Pulling fs layer 2022-09-27T15:39:20.0802256Z 903ca36d4d71: Waiting 2022-09-27T15:39:20.0802487Z 98a26bc0781e: Waiting 2022-09-27T15:39:20.0802738Z e6d72a41a09b: Pulling fs layer 2022-09-27T15:39:20.0802975Z c76c6ad2ac0f: Waiting 2022-09-27T15:39:20.0803228Z acb01049a64b: Pulling fs layer 2022-09-27T15:39:20.0803494Z 343cc73c5973: Pulling fs layer 2022-09-27T15:39:20.0803740Z 7d69e17e7339: Pulling fs layer 2022-09-27T15:39:20.0804007Z 5d0b32cc6f2a: Pulling fs layer 2022-09-27T15:39:20.0804262Z 62fa99d47769: Waiting 2022-09-27T15:39:20.0804496Z a6c12031bfcf: Pulling fs layer 2022-09-27T15:39:20.0804752Z b752992950f8: Waiting 2022-09-27T15:39:20.0804985Z 7d69e17e7339: Waiting 2022-09-27T15:39:20.0805200Z 343cc73c5973: Waiting 2022-09-27T15:39:20.0805438Z 5d0b32cc6f2a: Waiting 2022-09-27T15:39:20.0805677Z a6c12031bfcf: Waiting 2022-09-27T15:39:20.0805899Z e6d72a41a09b: Waiting 2022-09-27T15:39:20.0806135Z acb01049a64b: Waiting 2022-09-27T15:39:20.0806369Z 93e5a7080833: Waiting 2022-09-27T15:39:20.0806594Z 751286b45698: Waiting 2022-09-27T15:39:20.0806826Z b9b9b9d06eef: Waiting 2022-09-27T15:39:20.0807066Z eadc05ea2cd3: Waiting 2022-09-27T15:39:20.0807285Z 649c4428b346: Waiting 2022-09-27T15:39:20.0807517Z cc8443c330a0: Waiting 2022-09-27T15:39:20.0807753Z c2fcfa2400df: Waiting 2022-09-27T15:39:20.2254492Z bd44602516a4: Verifying Checksum 2022-09-27T15:39:20.2254819Z bd44602516a4: Download complete 2022-09-27T15:39:20.2989779Z 1589dc294916: Verifying Checksum 2022-09-27T15:39:20.2990125Z 1589dc294916: Download complete 2022-09-27T15:39:20.3732816Z 8ebfb31ea67d: Verifying Checksum 2022-09-27T15:39:20.3733636Z 8ebfb31ea67d: Download complete 2022-09-27T15:39:20.3823867Z 2c3a764ff1ef: Verifying Checksum 2022-09-27T15:39:20.3824647Z 2c3a764ff1ef: Download complete 2022-09-27T15:39:20.4129913Z 40dd5be53814: Verifying Checksum 2022-09-27T15:39:20.4130405Z 40dd5be53814: Download complete 2022-09-27T15:39:20.4526125Z d6e4b45751c9: Download complete 2022-09-27T15:39:20.5382908Z 07c42b0591b2: Verifying Checksum 2022-09-27T15:39:20.5383214Z 07c42b0591b2: Download complete 2022-09-27T15:39:20.6198422Z 9be88323b57e: Verifying Checksum 2022-09-27T15:39:20.6199064Z 9be88323b57e: Download complete 
2022-09-27T15:39:21.1496972Z 40dd5be53814: Pull complete 2022-09-27T15:39:21.4329668Z bd44602516a4: Pull complete 2022-09-27T15:39:21.9715567Z 8ebfb31ea67d: Pull complete 2022-09-27T15:39:22.1206739Z 1589dc294916: Pull complete 2022-09-27T15:39:22.2254913Z 2c3a764ff1ef: Pull complete 2022-09-27T15:39:22.7454504Z 2c7b68ade49f: Download complete 2022-09-27T15:39:22.8269446Z 44206692de1d: Verifying Checksum 2022-09-27T15:39:22.8270300Z 44206692de1d: Download complete 2022-09-27T15:39:22.9027139Z f751461554fa: Verifying Checksum 2022-09-27T15:39:22.9027700Z f751461554fa: Download complete 2022-09-27T15:39:23.0050530Z 316750fef2e6: Download complete 2022-09-27T15:39:23.7694256Z c069021d810b: Verifying Checksum 2022-09-27T15:39:23.7694575Z c069021d810b: Download complete 2022-09-27T15:39:23.8519543Z e0fdd58e805b: Verifying Checksum 2022-09-27T15:39:23.8519862Z e0fdd58e805b: Download complete 2022-09-27T15:39:23.9290028Z 751286b45698: Verifying Checksum 2022-09-27T15:39:23.9290322Z 751286b45698: Download complete 2022-09-27T15:39:31.5968534Z 2fb24fb5f7cb: Verifying Checksum 2022-09-27T15:39:31.5968882Z 2fb24fb5f7cb: Download complete 2022-09-27T15:39:31.6864618Z 3bb9e7ea569e: Verifying Checksum 2022-09-27T15:39:31.6864974Z 3bb9e7ea569e: Download complete 2022-09-27T15:39:31.7754733Z efeff9c74fbf: Verifying Checksum 2022-09-27T15:39:31.7755186Z efeff9c74fbf: Download complete 2022-09-27T15:39:31.8962263Z 81a5271d43c8: Download complete 2022-09-27T15:39:31.9935794Z 903ca36d4d71: Verifying Checksum 2022-09-27T15:39:31.9936153Z 903ca36d4d71: Download complete 2022-09-27T15:39:32.0904753Z d52c758f8e75: Verifying Checksum 2022-09-27T15:39:32.0905162Z d52c758f8e75: Download complete 2022-09-27T15:39:32.1909728Z a4ce2fdd9133: Verifying Checksum 2022-09-27T15:39:32.1910350Z a4ce2fdd9133: Download complete 2022-09-27T15:39:34.7244092Z cae8823a1cd1: Verifying Checksum 2022-09-27T15:39:34.7244734Z cae8823a1cd1: Download complete 2022-09-27T15:39:34.7825437Z 98a26bc0781e: Verifying Checksum 2022-09-27T15:39:34.7825739Z 98a26bc0781e: Download complete 2022-09-27T15:39:34.8075605Z 3298fe919163: Download complete 2022-09-27T15:39:34.8827447Z b9b9b9d06eef: Download complete 2022-09-27T15:39:34.8955047Z 62fa99d47769: Verifying Checksum 2022-09-27T15:39:34.8955737Z 62fa99d47769: Download complete 2022-09-27T15:39:34.9706921Z 17acc9e30503: Verifying Checksum 2022-09-27T15:39:34.9707240Z 17acc9e30503: Download complete 2022-09-27T15:39:34.9863184Z e8b4222e7a59: Verifying Checksum 2022-09-27T15:39:34.9863606Z e8b4222e7a59: Download complete 2022-09-27T15:39:35.0740383Z cc8443c330a0: Verifying Checksum 2022-09-27T15:39:35.0741120Z cc8443c330a0: Download complete 2022-09-27T15:39:35.1501999Z c2fcfa2400df: Verifying Checksum 2022-09-27T15:39:35.1502328Z c2fcfa2400df: Download complete 2022-09-27T15:39:35.4796609Z dce607cbd09e: Verifying Checksum 2022-09-27T15:39:35.4797231Z dce607cbd09e: Download complete 2022-09-27T15:39:35.5685045Z 45b253446018: Verifying Checksum 2022-09-27T15:39:35.5685425Z 45b253446018: Download complete 2022-09-27T15:39:35.6538508Z 752f98c7a6d7: Verifying Checksum 2022-09-27T15:39:35.6539131Z 752f98c7a6d7: Download complete 2022-09-27T15:39:35.9121741Z 410e31c94a04: Download complete 2022-09-27T15:39:35.9862872Z aa4bb3ec24a7: Verifying Checksum 2022-09-27T15:39:35.9863491Z aa4bb3ec24a7: Download complete 2022-09-27T15:39:36.4517930Z 6a9eea4b3aa4: Verifying Checksum 2022-09-27T15:39:36.4518608Z 6a9eea4b3aa4: Download complete 2022-09-27T15:39:36.5391848Z a823f5718e87: Verifying Checksum 
2022-09-27T15:39:36.5392467Z a823f5718e87: Download complete 2022-09-27T15:39:36.6629976Z 788ace045743: Verifying Checksum 2022-09-27T15:39:36.6630325Z 788ace045743: Download complete 2022-09-27T15:39:37.1986850Z b752992950f8: Verifying Checksum 2022-09-27T15:39:37.1987453Z b752992950f8: Download complete 2022-09-27T15:39:37.2800744Z 6cd5f9a2c4ae: Download complete 2022-09-27T15:39:37.3696629Z 577da355ab1b: Download complete 2022-09-27T15:39:37.4494621Z 649c4428b346: Verifying Checksum 2022-09-27T15:39:37.4495012Z 649c4428b346: Download complete 2022-09-27T15:39:37.5284762Z eadc05ea2cd3: Download complete 2022-09-27T15:39:37.7284896Z 93e5a7080833: Verifying Checksum 2022-09-27T15:39:37.7285410Z 93e5a7080833: Download complete 2022-09-27T15:39:37.8048809Z e6d72a41a09b: Verifying Checksum 2022-09-27T15:39:37.8049145Z e6d72a41a09b: Download complete 2022-09-27T15:39:38.4067879Z acb01049a64b: Verifying Checksum 2022-09-27T15:39:38.4068542Z acb01049a64b: Download complete 2022-09-27T15:39:38.5149250Z 343cc73c5973: Verifying Checksum 2022-09-27T15:39:38.5149594Z 343cc73c5973: Download complete 2022-09-27T15:39:39.6612821Z c76c6ad2ac0f: Verifying Checksum 2022-09-27T15:39:39.6613181Z c76c6ad2ac0f: Download complete 2022-09-27T15:39:39.7500075Z 5d0b32cc6f2a: Verifying Checksum 2022-09-27T15:39:39.7500379Z 5d0b32cc6f2a: Download complete 2022-09-27T15:39:39.8454725Z a6c12031bfcf: Verifying Checksum 2022-09-27T15:39:39.8455068Z a6c12031bfcf: Download complete 2022-09-27T15:39:45.4119491Z 2fb24fb5f7cb: Pull complete 2022-09-27T15:39:45.5120602Z d6e4b45751c9: Pull complete 2022-09-27T15:40:07.3423375Z 98a26bc0781e: Pull complete 2022-09-27T15:40:09.2194868Z 07c42b0591b2: Pull complete 2022-09-27T15:40:11.0926923Z 9be88323b57e: Pull complete 2022-09-27T15:40:17.8204322Z 0c8bd29be614: Verifying Checksum 2022-09-27T15:40:17.8204701Z 0c8bd29be614: Download complete 2022-09-27T15:40:19.0964925Z 2c7b68ade49f: Pull complete 2022-09-27T15:40:20.9413310Z 44206692de1d: Pull complete 2022-09-27T15:40:22.7873133Z f751461554fa: Pull complete 2022-09-27T15:40:24.6672287Z 316750fef2e6: Pull complete 2022-09-27T15:40:28.6710535Z c069021d810b: Pull complete 2022-09-27T15:40:30.7075556Z e0fdd58e805b: Pull complete 2022-09-27T15:40:32.6172513Z 751286b45698: Pull complete 2022-09-27T15:40:39.3388789Z 7d69e17e7339: Verifying Checksum 2022-09-27T15:40:39.3389143Z 7d69e17e7339: Download complete 2022-09-27T15:41:10.3789544Z 0c8bd29be614: Pull complete 2022-09-27T15:41:12.2239250Z 3bb9e7ea569e: Pull complete 2022-09-27T15:41:14.0606823Z efeff9c74fbf: Pull complete 2022-09-27T15:41:15.8975101Z 81a5271d43c8: Pull complete 2022-09-27T15:41:17.2850106Z 903ca36d4d71: Pull complete 2022-09-27T15:41:19.1582598Z d52c758f8e75: Pull complete 2022-09-27T15:41:21.0340097Z a4ce2fdd9133: Pull complete 2022-09-27T15:41:25.2424442Z cae8823a1cd1: Pull complete 2022-09-27T15:41:27.9634515Z 3298fe919163: Pull complete 2022-09-27T15:41:31.4917838Z b9b9b9d06eef: Pull complete 2022-09-27T15:41:34.3492371Z 62fa99d47769: Pull complete 2022-09-27T15:41:36.5939854Z 17acc9e30503: Pull complete 2022-09-27T15:41:39.1382752Z e8b4222e7a59: Pull complete 2022-09-27T15:41:47.5318208Z b752992950f8: Pull complete 2022-09-27T15:41:49.4747477Z cc8443c330a0: Pull complete 2022-09-27T15:41:51.3200212Z c2fcfa2400df: Pull complete 2022-09-27T15:41:53.8977674Z dce607cbd09e: Pull complete 2022-09-27T15:41:55.7441483Z 45b253446018: Pull complete 2022-09-27T15:41:57.6516591Z 752f98c7a6d7: Pull complete 2022-09-27T15:41:59.9057165Z 410e31c94a04: Pull complete 
2022-09-27T15:42:01.7539433Z aa4bb3ec24a7: Pull complete 2022-09-27T15:42:05.4679997Z 6a9eea4b3aa4: Pull complete 2022-09-27T15:42:05.7092720Z a823f5718e87: Pull complete 2022-09-27T15:42:05.8277860Z 788ace045743: Pull complete 2022-09-27T15:42:11.7919546Z c76c6ad2ac0f: Pull complete 2022-09-27T15:42:12.0599678Z 6cd5f9a2c4ae: Pull complete 2022-09-27T15:42:12.3232599Z 577da355ab1b: Pull complete 2022-09-27T15:42:12.5799835Z 649c4428b346: Pull complete 2022-09-27T15:42:12.8365114Z eadc05ea2cd3: Pull complete 2022-09-27T15:42:15.1808311Z 93e5a7080833: Pull complete 2022-09-27T15:42:15.4317531Z e6d72a41a09b: Pull complete 2022-09-27T15:42:20.6472032Z acb01049a64b: Pull complete 2022-09-27T15:42:20.8930205Z 343cc73c5973: Pull complete 2022-09-27T15:43:04.7083158Z 7d69e17e7339: Pull complete 2022-09-27T15:43:06.5565683Z 5d0b32cc6f2a: Pull complete 2022-09-27T15:43:08.4359382Z a6c12031bfcf: Pull complete 2022-09-27T15:43:09.7836429Z Digest: sha256:9bb261bc4d8aeb82a71b1f0709da9c979e85a12a79c4a85c3fe3adddddcb2663 2022-09-27T15:43:10.2845895Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T15:43:10.5667155Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T15:43:10.5753810Z ##[group]Run nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767 2022-09-27T15:43:10.5754266Z with: 2022-09-27T15:43:10.5754585Z timeout_minutes: 10 2022-09-27T15:43:10.5754913Z max_attempts: 3 2022-09-27T15:43:10.5755372Z command: set -ex bash .github/scripts/install_nvidia_utils_linux.sh echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}" 2022-09-27T15:43:10.5755838Z retry_wait_seconds: 10 2022-09-27T15:43:10.5756128Z polling_interval_seconds: 1 2022-09-27T15:43:10.5756466Z warning_on_retry: true 2022-09-27T15:43:10.5756842Z continue_on_error: false 2022-09-27T15:43:10.5757109Z env: 2022-09-27T15:43:10.5757412Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:43:10.5757739Z ##[endgroup] 2022-09-27T15:43:10.6299500Z 2022-09-27T15:43:10.6361948Z + bash .github/scripts/install_nvidia_utils_linux.sh 2022-09-27T15:43:10.6364399Z == Installing nvidia driver NVIDIA-Linux-x86_64-515.57.run == 2022-09-27T15:43:10.6365327Z + HAS_NVIDIA_DRIVER=0 2022-09-27T15:43:10.6368271Z ++ command -v nvidia-smi 2022-09-27T15:43:10.6370629Z + '[' -x '' ']' 2022-09-27T15:43:10.6371280Z + '[' 0 -eq 0 ']' 2022-09-27T15:43:10.6371763Z + sudo yum groupinstall -y 'Development Tools' 2022-09-27T15:43:11.1059456Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-09-27T15:43:11.4168901Z Resolving Dependencies 2022-09-27T15:43:11.4173734Z --> Running transaction check 2022-09-27T15:43:11.4176594Z ---> Package autoconf.noarch 0:2.69-11.amzn2 will be installed 2022-09-27T15:43:11.4388505Z --> Processing Dependency: m4 >= 1.4.14 for package: autoconf-2.69-11.amzn2.noarch 2022-09-27T15:43:11.6397251Z --> Processing Dependency: perl(Data::Dumper) for package: autoconf-2.69-11.amzn2.noarch 2022-09-27T15:43:11.6399463Z ---> Package automake.noarch 0:1.13.4-3.1.amzn2 will be installed 2022-09-27T15:43:11.6443682Z --> Processing Dependency: perl(Thread::Queue) for package: automake-1.13.4-3.1.amzn2.noarch 2022-09-27T15:43:11.6450379Z --> Processing Dependency: perl(TAP::Parser) for package: automake-1.13.4-3.1.amzn2.noarch 2022-09-27T15:43:11.6460880Z ---> Package bison.x86_64 0:3.0.4-6.amzn2.0.2 will be installed 
2022-09-27T15:43:11.6570026Z ---> Package byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 will be installed 2022-09-27T15:43:11.6576912Z ---> Package cscope.x86_64 0:15.8-10.amzn2.0.2 will be installed 2022-09-27T15:43:11.6619886Z --> Processing Dependency: emacs-filesystem for package: cscope-15.8-10.amzn2.0.2.x86_64 2022-09-27T15:43:11.6643610Z ---> Package ctags.x86_64 0:5.8-13.amzn2.0.2 will be installed 2022-09-27T15:43:11.6652179Z ---> Package diffstat.x86_64 0:1.57-4.amzn2.0.2 will be installed 2022-09-27T15:43:11.6659886Z ---> Package doxygen.x86_64 1:1.8.5-4.amzn2 will be installed 2022-09-27T15:43:11.6751858Z ---> Package elfutils.x86_64 0:0.176-2.amzn2 will be installed 2022-09-27T15:43:11.6887705Z ---> Package flex.x86_64 0:2.5.37-3.amzn2.0.3 will be installed 2022-09-27T15:43:11.6905939Z ---> Package gcc.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.7070042Z --> Processing Dependency: cpp = 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7089050Z --> Processing Dependency: libsanitizer >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7142189Z --> Processing Dependency: libquadmath >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7192428Z --> Processing Dependency: libmpx >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7246644Z --> Processing Dependency: libitm >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7296595Z --> Processing Dependency: libcilkrts >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7350754Z --> Processing Dependency: libatomic >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7404733Z --> Processing Dependency: glibc-devel >= 2.2.90-12 for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7549515Z --> Processing Dependency: libmpfr.so.4()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7569970Z --> Processing Dependency: libmpc.so.3()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7590548Z ---> Package gcc-c++.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.7616888Z ---> Package gcc-gfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.7649596Z --> Processing Dependency: libgfortran.so.4()(64bit) for package: gcc-gfortran-7.3.1-15.amzn2.x86_64 2022-09-27T15:43:11.7711117Z ---> Package indent.x86_64 0:2.2.11-13.amzn2.0.2 will be installed 2022-09-27T15:43:11.7725567Z ---> Package intltool.noarch 0:0.50.2-7.amzn2 will be installed 2022-09-27T15:43:11.7775406Z --> Processing Dependency: perl(XML::Parser) for package: intltool-0.50.2-7.amzn2.noarch 2022-09-27T15:43:11.7789083Z --> Processing Dependency: gettext-devel for package: intltool-0.50.2-7.amzn2.noarch 2022-09-27T15:43:11.7807562Z ---> Package libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 will be installed 2022-09-27T15:43:11.7837754Z ---> Package patch.x86_64 0:2.7.1-12.amzn2.0.2 will be installed 2022-09-27T15:43:11.7872045Z ---> Package patchutils.x86_64 0:0.3.3-4.amzn2.0.1 will be installed 2022-09-27T15:43:11.7896386Z ---> Package rcs.x86_64 0:5.9.0-5.amzn2.0.2 will be installed 2022-09-27T15:43:11.7929039Z ---> Package rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-09-27T15:43:11.8161465Z --> Processing Dependency: /usr/bin/gdb-add-index for package: rpm-build-4.11.3-48.amzn2.0.2.x86_64 2022-09-27T15:43:11.8179231Z ---> Package rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-09-27T15:43:11.8202385Z ---> 
Package subversion.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-09-27T15:43:11.8364919Z --> Processing Dependency: subversion-libs(x86-64) = 1.7.14-16.amzn2.0.1 for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8383499Z --> Processing Dependency: libsvn_wc-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8384651Z --> Processing Dependency: libsvn_subr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8385257Z --> Processing Dependency: libsvn_repos-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8386227Z --> Processing Dependency: libsvn_ra_svn-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8386852Z --> Processing Dependency: libsvn_ra_neon-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8387477Z --> Processing Dependency: libsvn_ra_local-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8388075Z --> Processing Dependency: libsvn_ra-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8388695Z --> Processing Dependency: libsvn_fs_util-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8389653Z --> Processing Dependency: libsvn_fs_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8390326Z --> Processing Dependency: libsvn_fs_base-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8391425Z --> Processing Dependency: libsvn_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8392057Z --> Processing Dependency: libsvn_diff-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8392668Z --> Processing Dependency: libsvn_delta-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8393291Z --> Processing Dependency: libsvn_client-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8393903Z --> Processing Dependency: libneon.so.27()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8411524Z --> Processing Dependency: libaprutil-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8432561Z --> Processing Dependency: libapr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-09-27T15:43:11.8456003Z ---> Package swig.x86_64 0:3.0.12-11.amzn2.0.3 will be installed 2022-09-27T15:43:11.8477679Z ---> Package system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 will be installed 2022-09-27T15:43:11.8521987Z --> Processing Dependency: dwz >= 0.4 for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-09-27T15:43:11.8539192Z --> Processing Dependency: perl-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-09-27T15:43:11.8551121Z --> Processing Dependency: go-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-09-27T15:43:11.8712312Z ---> Package systemtap.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-09-27T15:43:11.8725000Z --> Processing Dependency: systemtap-devel = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-09-27T15:43:11.8739180Z --> Processing Dependency: systemtap-client = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-09-27T15:43:11.8752508Z --> Running transaction check 
2022-09-27T15:43:11.8755778Z ---> Package apr.x86_64 0:1.7.0-9.amzn2 will be installed 2022-09-27T15:43:11.8826872Z ---> Package apr-util.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-09-27T15:43:11.8864092Z --> Processing Dependency: apr-util-bdb(x86-64) = 1.6.1-5.amzn2.0.2 for package: apr-util-1.6.1-5.amzn2.0.2.x86_64 2022-09-27T15:43:11.8878269Z ---> Package cpp.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.8949015Z ---> Package dwz.x86_64 0:0.11-3.amzn2.0.3 will be installed 2022-09-27T15:43:11.8959660Z ---> Package emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 will be installed 2022-09-27T15:43:11.8960812Z ---> Package gdb.x86_64 0:8.0.1-36.amzn2.0.1 will be installed 2022-09-27T15:43:11.9028819Z ---> Package gettext-devel.x86_64 0:0.19.8.1-3.amzn2 will be installed 2022-09-27T15:43:11.9091510Z --> Processing Dependency: gettext-common-devel = 0.19.8.1-3.amzn2 for package: gettext-devel-0.19.8.1-3.amzn2.x86_64 2022-09-27T15:43:11.9100238Z ---> Package glibc-devel.x86_64 0:2.26-60.amzn2 will be installed 2022-09-27T15:43:11.9213950Z --> Processing Dependency: glibc-headers = 2.26-60.amzn2 for package: glibc-devel-2.26-60.amzn2.x86_64 2022-09-27T15:43:11.9239357Z --> Processing Dependency: glibc-headers for package: glibc-devel-2.26-60.amzn2.x86_64 2022-09-27T15:43:11.9240526Z ---> Package go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.1 will be installed 2022-09-27T15:43:11.9245410Z ---> Package libatomic.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9258336Z ---> Package libcilkrts.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9285077Z ---> Package libgfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9319287Z ---> Package libitm.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9334774Z ---> Package libmpc.x86_64 0:1.0.1-3.amzn2.0.2 will be installed 2022-09-27T15:43:11.9346831Z ---> Package libmpx.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9361384Z ---> Package libquadmath.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9385638Z ---> Package libsanitizer.x86_64 0:7.3.1-15.amzn2 will be installed 2022-09-27T15:43:11.9430290Z ---> Package m4.x86_64 0:1.4.16-10.amzn2.0.2 will be installed 2022-09-27T15:43:11.9445234Z ---> Package mpfr.x86_64 0:3.1.1-4.amzn2.0.2 will be installed 2022-09-27T15:43:11.9465595Z ---> Package neon.x86_64 0:0.30.0-3.amzn2.0.2 will be installed 2022-09-27T15:43:11.9538872Z --> Processing Dependency: libgnutls.so.28(GNUTLS_2_12)(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-09-27T15:43:11.9575561Z --> Processing Dependency: libgnutls.so.28(GNUTLS_1_4)(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-09-27T15:43:11.9576599Z --> Processing Dependency: libproxy.so.1()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-09-27T15:43:11.9595146Z --> Processing Dependency: libpakchois.so.0()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-09-27T15:43:11.9612235Z --> Processing Dependency: libgnutls.so.28()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-09-27T15:43:11.9618259Z ---> Package perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 will be installed 2022-09-27T15:43:11.9666384Z ---> Package perl-Test-Harness.noarch 0:3.28-3.amzn2 will be installed 2022-09-27T15:43:11.9758433Z ---> Package perl-Thread-Queue.noarch 0:3.02-2.amzn2 will be installed 2022-09-27T15:43:11.9770234Z ---> Package perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 will be installed 2022-09-27T15:43:11.9784981Z ---> Package perl-srpm-macros.noarch 
0:1-8.amzn2.0.1 will be installed 2022-09-27T15:43:11.9786358Z ---> Package subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-09-27T15:43:11.9813925Z ---> Package systemtap-client.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-09-27T15:43:12.0011594Z --> Processing Dependency: mokutil for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-09-27T15:43:12.0024943Z --> Processing Dependency: libavahi-common.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-09-27T15:43:12.0050107Z --> Processing Dependency: libavahi-client.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-09-27T15:43:12.0050673Z ---> Package systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-09-27T15:43:12.0168843Z --> Processing Dependency: kernel-devel-uname-r for package: systemtap-devel-4.5-1.amzn2.0.1.x86_64 2022-09-27T15:43:12.1147834Z --> Running transaction check 2022-09-27T15:43:12.1149321Z ---> Package apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-09-27T15:43:12.1158965Z ---> Package avahi-libs.x86_64 0:0.6.31-20.amzn2 will be installed 2022-09-27T15:43:12.1183855Z ---> Package gettext-common-devel.noarch 0:0.19.8.1-3.amzn2 will be installed 2022-09-27T15:43:12.1185058Z ---> Package glibc-headers.x86_64 0:2.26-60.amzn2 will be installed 2022-09-27T15:43:12.1254437Z --> Processing Dependency: kernel-headers >= 2.2.1 for package: glibc-headers-2.26-60.amzn2.x86_64 2022-09-27T15:43:12.2303471Z --> Processing Dependency: kernel-headers for package: glibc-headers-2.26-60.amzn2.x86_64 2022-09-27T15:43:12.2304019Z ---> Package gnutls.x86_64 0:3.3.29-9.amzn2.0.1 will be installed 2022-09-27T15:43:12.2367654Z --> Processing Dependency: trousers >= 0.3.11.2 for package: gnutls-3.3.29-9.amzn2.0.1.x86_64 2022-09-27T15:43:12.2393350Z ---> Package kernel-devel.x86_64 0:4.14.291-218.527.amzn2 will be installed 2022-09-27T15:43:12.2419090Z --> Processing Dependency: elfutils-libelf-devel for package: kernel-devel-4.14.291-218.527.amzn2.x86_64 2022-09-27T15:43:12.2438598Z ---> Package libproxy.x86_64 0:0.4.11-10.amzn2.0.3 will be installed 2022-09-27T15:43:12.2465681Z --> Processing Dependency: libmodman.so.1()(64bit) for package: libproxy-0.4.11-10.amzn2.0.3.x86_64 2022-09-27T15:43:12.2483479Z ---> Package mokutil.x86_64 1:0.3.0-10.amzn2.0.1 will be installed 2022-09-27T15:43:12.2530596Z --> Processing Dependency: libefivar.so.1(libefivar.so.0)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-09-27T15:43:12.2551371Z --> Processing Dependency: libefivar.so.1(LIBEFIVAR_0.24)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-09-27T15:43:12.2552617Z --> Processing Dependency: libefivar.so.1()(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-09-27T15:43:12.2553493Z ---> Package pakchois.x86_64 0:0.4-10.amzn2.0.2 will be installed 2022-09-27T15:43:12.2565567Z --> Running transaction check 2022-09-27T15:43:12.2566581Z ---> Package efivar-libs.x86_64 0:31-4.amzn2.0.4 will be installed 2022-09-27T15:43:12.2583322Z ---> Package elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 will be installed 2022-09-27T15:43:12.2595145Z --> Processing Dependency: pkgconfig(zlib) for package: elfutils-libelf-devel-0.176-2.amzn2.x86_64 2022-09-27T15:43:12.2618013Z ---> Package kernel-headers.x86_64 0:4.14.291-218.527.amzn2 will be installed 2022-09-27T15:43:12.2619114Z ---> Package libmodman.x86_64 0:2.0.1-8.amzn2.0.2 will be installed 2022-09-27T15:43:12.2636628Z ---> Package trousers.x86_64 0:0.3.14-2.amzn2.0.2 will be installed 
2022-09-27T15:43:12.2692230Z --> Running transaction check 2022-09-27T15:43:12.2692722Z ---> Package zlib-devel.x86_64 0:1.2.7-19.amzn2.0.1 will be installed 2022-09-27T15:43:12.5267699Z --> Finished Dependency Resolution 2022-09-27T15:43:12.6018877Z 2022-09-27T15:43:12.6019570Z Dependencies Resolved 2022-09-27T15:43:12.6130231Z 2022-09-27T15:43:12.6130683Z ================================================================================ 2022-09-27T15:43:12.6131064Z Package Arch Version Repository Size 2022-09-27T15:43:12.6133103Z ================================================================================ 2022-09-27T15:43:12.6133474Z Installing for group install "Development Tools": 2022-09-27T15:43:12.6134053Z autoconf noarch 2.69-11.amzn2 amzn2-core 701 k 2022-09-27T15:43:12.6134513Z automake noarch 1.13.4-3.1.amzn2 amzn2-core 679 k 2022-09-27T15:43:12.6134964Z bison x86_64 3.0.4-6.amzn2.0.2 amzn2-core 674 k 2022-09-27T15:43:12.6135401Z byacc x86_64 1.9.20130304-3.amzn2.0.2 amzn2-core 66 k 2022-09-27T15:43:12.6135834Z cscope x86_64 15.8-10.amzn2.0.2 amzn2-core 204 k 2022-09-27T15:43:12.6136260Z ctags x86_64 5.8-13.amzn2.0.2 amzn2-core 157 k 2022-09-27T15:43:12.6137452Z diffstat x86_64 1.57-4.amzn2.0.2 amzn2-core 35 k 2022-09-27T15:43:12.6137868Z doxygen x86_64 1:1.8.5-4.amzn2 amzn2-core 3.5 M 2022-09-27T15:43:12.6138300Z elfutils x86_64 0.176-2.amzn2 amzn2-core 307 k 2022-09-27T15:43:12.6138724Z flex x86_64 2.5.37-3.amzn2.0.3 amzn2-core 291 k 2022-09-27T15:43:12.6139142Z gcc x86_64 7.3.1-15.amzn2 amzn2-core 22 M 2022-09-27T15:43:12.6142047Z gcc-c++ x86_64 7.3.1-15.amzn2 amzn2-core 13 M 2022-09-27T15:43:12.6142533Z gcc-gfortran x86_64 7.3.1-15.amzn2 amzn2-core 11 M 2022-09-27T15:43:12.6143010Z indent x86_64 2.2.11-13.amzn2.0.2 amzn2-core 150 k 2022-09-27T15:43:12.6143456Z intltool noarch 0.50.2-7.amzn2 amzn2-core 59 k 2022-09-27T15:43:12.6143882Z libtool x86_64 2.4.2-22.2.amzn2.0.2 amzn2-core 588 k 2022-09-27T15:43:12.6144324Z patch x86_64 2.7.1-12.amzn2.0.2 amzn2-core 110 k 2022-09-27T15:43:12.6144766Z patchutils x86_64 0.3.3-4.amzn2.0.1 amzn2-core 104 k 2022-09-27T15:43:12.6145202Z rcs x86_64 5.9.0-5.amzn2.0.2 amzn2-core 231 k 2022-09-27T15:43:12.6145621Z rpm-build x86_64 4.11.3-48.amzn2.0.2 amzn2-core 150 k 2022-09-27T15:43:12.6146058Z rpm-sign x86_64 4.11.3-48.amzn2.0.2 amzn2-core 50 k 2022-09-27T15:43:12.6146499Z subversion x86_64 1.7.14-16.amzn2.0.1 amzn2-core 1.0 M 2022-09-27T15:43:12.6146906Z swig x86_64 3.0.12-11.amzn2.0.3 amzn2-core 1.4 M 2022-09-27T15:43:12.6147557Z system-rpm-config noarch 9.1.0-76.amzn2.0.14 amzn2-core 90 k 2022-09-27T15:43:12.6148016Z systemtap x86_64 4.5-1.amzn2.0.1 amzn2-core 12 k 2022-09-27T15:43:12.6148340Z Installing for dependencies: 2022-09-27T15:43:12.6148731Z apr x86_64 1.7.0-9.amzn2 amzn2-core 122 k 2022-09-27T15:43:12.6149156Z apr-util x86_64 1.6.1-5.amzn2.0.2 amzn2-core 99 k 2022-09-27T15:43:12.6149601Z apr-util-bdb x86_64 1.6.1-5.amzn2.0.2 amzn2-core 19 k 2022-09-27T15:43:12.6150030Z avahi-libs x86_64 0.6.31-20.amzn2 amzn2-core 61 k 2022-09-27T15:43:12.6150459Z cpp x86_64 7.3.1-15.amzn2 amzn2-core 9.2 M 2022-09-27T15:43:12.6151383Z dwz x86_64 0.11-3.amzn2.0.3 amzn2-core 98 k 2022-09-27T15:43:12.6152242Z efivar-libs x86_64 31-4.amzn2.0.4 amzn2-core 68 k 2022-09-27T15:43:12.6152696Z elfutils-libelf-devel x86_64 0.176-2.amzn2 amzn2-core 40 k 2022-09-27T15:43:12.6153163Z emacs-filesystem noarch 1:27.2-4.amzn2.0.1 amzn2-core 67 k 2022-09-27T15:43:12.6153613Z gdb x86_64 8.0.1-36.amzn2.0.1 amzn2-core 3.1 M 2022-09-27T15:43:12.6154046Z 
gettext-common-devel noarch 0.19.8.1-3.amzn2 amzn2-core 410 k 2022-09-27T15:43:12.6154515Z gettext-devel x86_64 0.19.8.1-3.amzn2 amzn2-core 320 k 2022-09-27T15:43:12.6154998Z glibc-devel x86_64 2.26-60.amzn2 amzn2-core 994 k 2022-09-27T15:43:12.6155443Z glibc-headers x86_64 2.26-60.amzn2 amzn2-core 515 k 2022-09-27T15:43:12.6155868Z gnutls x86_64 3.3.29-9.amzn2.0.1 amzn2-core 661 k 2022-09-27T15:43:12.6156309Z go-srpm-macros noarch 3.0.15-23.amzn2.0.1 amzn2-core 23 k 2022-09-27T15:43:12.6156767Z kernel-devel x86_64 4.14.291-218.527.amzn2 amzn2-core 13 M 2022-09-27T15:43:12.6157216Z kernel-headers x86_64 4.14.291-218.527.amzn2 amzn2-core 1.2 M 2022-09-27T15:43:12.6157638Z libatomic x86_64 7.3.1-15.amzn2 amzn2-core 46 k 2022-09-27T15:43:12.6158064Z libcilkrts x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-09-27T15:43:12.6158494Z libgfortran x86_64 7.3.1-15.amzn2 amzn2-core 536 k 2022-09-27T15:43:12.6158897Z libitm x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-09-27T15:43:12.6159327Z libmodman x86_64 2.0.1-8.amzn2.0.2 amzn2-core 29 k 2022-09-27T15:43:12.6159755Z libmpc x86_64 1.0.1-3.amzn2.0.2 amzn2-core 52 k 2022-09-27T15:43:12.6160184Z libmpx x86_64 7.3.1-15.amzn2 amzn2-core 51 k 2022-09-27T15:43:12.6160595Z libproxy x86_64 0.4.11-10.amzn2.0.3 amzn2-core 61 k 2022-09-27T15:43:12.6161029Z libquadmath x86_64 7.3.1-15.amzn2 amzn2-core 189 k 2022-09-27T15:43:12.6161461Z libsanitizer x86_64 7.3.1-15.amzn2 amzn2-core 642 k 2022-09-27T15:43:12.6161869Z m4 x86_64 1.4.16-10.amzn2.0.2 amzn2-core 256 k 2022-09-27T15:43:12.6162286Z mokutil x86_64 1:0.3.0-10.amzn2.0.1 amzn2-core 39 k 2022-09-27T15:43:12.6162701Z mpfr x86_64 3.1.1-4.amzn2.0.2 amzn2-core 208 k 2022-09-27T15:43:12.6163117Z neon x86_64 0.30.0-3.amzn2.0.2 amzn2-core 166 k 2022-09-27T15:43:12.6163528Z pakchois x86_64 0.4-10.amzn2.0.2 amzn2-core 14 k 2022-09-27T15:43:12.6163978Z perl-Data-Dumper x86_64 2.145-3.amzn2.0.2 amzn2-core 48 k 2022-09-27T15:43:12.6164442Z perl-Test-Harness noarch 3.28-3.amzn2 amzn2-core 302 k 2022-09-27T15:43:12.6164997Z perl-Thread-Queue noarch 3.02-2.amzn2 amzn2-core 17 k 2022-09-27T15:43:12.6165471Z perl-XML-Parser x86_64 2.41-10.amzn2.0.2 amzn2-core 223 k 2022-09-27T15:43:12.6165934Z perl-srpm-macros noarch 1-8.amzn2.0.1 amzn2-core 4.7 k 2022-09-27T15:43:12.6166396Z subversion-libs x86_64 1.7.14-16.amzn2.0.1 amzn2-core 912 k 2022-09-27T15:43:12.6166832Z systemtap-client x86_64 4.5-1.amzn2.0.1 amzn2-core 3.7 M 2022-09-27T15:43:12.6167292Z systemtap-devel x86_64 4.5-1.amzn2.0.1 amzn2-core 2.3 M 2022-09-27T15:43:12.6167736Z trousers x86_64 0.3.14-2.amzn2.0.2 amzn2-core 294 k 2022-09-27T15:43:12.6168150Z zlib-devel x86_64 1.2.7-19.amzn2.0.1 amzn2-core 50 k 2022-09-27T15:43:12.6168413Z 2022-09-27T15:43:12.6168535Z Transaction Summary 2022-09-27T15:43:12.6168826Z ================================================================================ 2022-09-27T15:43:12.6169146Z Install 25 Packages (+43 Dependent packages) 2022-09-27T15:43:12.6169324Z 2022-09-27T15:43:12.6169446Z Total download size: 96 M 2022-09-27T15:43:12.6169710Z Installed size: 303 M 2022-09-27T15:43:12.6169971Z Downloading packages: 2022-09-27T15:43:12.6179903Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
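
Before this yum transaction started, install_nvidia_utils_linux.sh probed for an existing driver (`command -v nvidia-smi` came back empty, so HAS_NVIDIA_DRIVER stayed 0) and only then pulled in 'Development Tools', which the kernel-module build needs. A minimal sketch of that idempotence gate, under the same assumptions as the trace above:

    #!/usr/bin/env bash
    set -ex

    # Only (re)install when no working nvidia-smi is already on PATH.
    HAS_NVIDIA_DRIVER=0
    if [ -x "$(command -v nvidia-smi)" ]; then
        HAS_NVIDIA_DRIVER=1
    fi

    if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then
        # Compilers, make, kernel headers, etc. for building the kernel module.
        sudo yum groupinstall -y 'Development Tools'
        # ...followed by the NVIDIA-Linux-x86_64-515.57.run installer, as logged above.
    fi
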
2022-09-27T15:43:14.0181249Z --------------------------------------------------------------------------------
2022-09-27T15:43:14.0182073Z Total 68 MB/s | 96 MB 00:01
2022-09-27T15:43:14.1239841Z Running transaction check
2022-09-27T15:43:14.2010387Z Running transaction test
2022-09-27T15:43:16.5793776Z Transaction test succeeded
2022-09-27T15:43:16.5798112Z Running transaction
2022-09-27T15:43:21.7673111Z Installing : mpfr-3.1.1-4.amzn2.0.2.x86_64 1/68
2022-09-27T15:43:24.2595176Z Installing : libmpc-1.0.1-3.amzn2.0.2.x86_64 2/68
2022-09-27T15:43:26.7251284Z Installing : m4-1.4.16-10.amzn2.0.2.x86_64 3/68
2022-09-27T15:43:29.1565361Z Installing : apr-1.7.0-9.amzn2.x86_64 4/68
2022-09-27T15:43:31.5783208Z Installing : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 5/68
2022-09-27T15:43:34.0486401Z Installing : apr-util-1.6.1-5.amzn2.0.2.x86_64 6/68
2022-09-27T15:43:36.5332961Z Installing : avahi-libs-0.6.31-20.amzn2.x86_64 7/68
2022-09-27T15:43:39.0004550Z Installing : libquadmath-7.3.1-15.amzn2.x86_64 8/68
2022-09-27T15:43:39.8695254Z Installing : patch-2.7.1-12.amzn2.0.2.x86_64 9/68
2022-09-27T15:43:39.9507274Z Installing : perl-Thread-Queue-3.02-2.amzn2.noarch 10/68
2022-09-27T15:43:41.0091409Z Installing : libgfortran-7.3.1-15.amzn2.x86_64 11/68
2022-09-27T15:43:41.0458680Z Installing : cpp-7.3.1-15.amzn2.x86_64 12/68
2022-09-27T15:43:41.0675781Z Installing : zlib-devel-1.2.7-19.amzn2.0.1.x86_64 13/68
2022-09-27T15:43:41.0917867Z Installing : elfutils-libelf-devel-0.176-2.amzn2.x86_64 14/68
2022-09-27T15:43:41.1254538Z Installing : libmodman-2.0.1-8.amzn2.0.2.x86_64 15/68
2022-09-27T15:43:41.1833807Z Installing : libproxy-0.4.11-10.amzn2.0.3.x86_64 16/68
2022-09-27T15:43:41.2422261Z Installing : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 17/68
2022-09-27T15:43:41.3553922Z Installing : elfutils-0.176-2.amzn2.x86_64 18/68
2022-09-27T15:43:41.3875264Z Installing : libsanitizer-7.3.1-15.amzn2.x86_64 19/68
2022-09-27T15:43:41.4104752Z Installing : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 20/68
2022-09-27T15:43:41.4409873Z Installing : efivar-libs-31-4.amzn2.0.4.x86_64 21/68
2022-09-27T15:43:41.4728928Z Installing : 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 22/68
2022-09-27T15:43:41.5574923Z Installing : dwz-0.11-3.amzn2.0.3.x86_64 23/68
2022-09-27T15:43:41.7257369Z Installing : trousers-0.3.14-2.amzn2.0.2.x86_64 24/68
2022-09-27T15:43:42.0167586Z Installing : gnutls-3.3.29-9.amzn2.0.1.x86_64 25/68
2022-09-27T15:43:42.1845939Z Installing : kernel-headers-4.14.291-218.527.amzn2.x86_64 26/68
2022-09-27T15:43:42.3152367Z Installing : glibc-headers-2.26-60.amzn2.x86_64 27/68
2022-09-27T15:43:42.3504222Z Installing : glibc-devel-2.26-60.amzn2.x86_64 28/68
2022-09-27T15:43:42.7426031Z Installing : libitm-7.3.1-15.amzn2.x86_64 29/68
2022-09-27T15:43:42.7722649Z Installing : gdb-8.0.1-36.amzn2.0.1.x86_64 30/68
2022-09-27T15:43:42.7973535Z Installing : libmpx-7.3.1-15.amzn2.x86_64 31/68
2022-09-27T15:43:42.8252649Z Installing : perl-srpm-macros-1-8.amzn2.0.1.noarch 32/68
2022-09-27T15:43:42.8497397Z Installing : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 33/68
2022-09-27T15:43:42.8719264Z Installing : go-srpm-macros-3.0.15-23.amzn2.0.1.noarch 34/68
2022-09-27T15:43:42.9640489Z Installing : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 35/68
2022-09-27T15:43:43.0090566Z Installing : autoconf-2.69-11.amzn2.noarch 36/68
2022-09-27T15:43:43.0838829Z Installing : gettext-common-devel-0.19.8.1-3.amzn2.noarch 37/68
2022-09-27T15:43:43.1732098Z Installing : gettext-devel-0.19.8.1-3.amzn2.x86_64 38/68
2022-09-27T15:43:43.2759820Z Installing : perl-Test-Harness-3.28-3.amzn2.noarch 39/68
2022-09-27T15:43:43.3123636Z Installing : automake-1.13.4-3.1.amzn2.noarch 40/68
2022-09-27T15:43:43.3450027Z Installing : libatomic-7.3.1-15.amzn2.x86_64 41/68
2022-09-27T15:43:45.3813247Z Installing : libcilkrts-7.3.1-15.amzn2.x86_64 42/68
2022-09-27T15:43:49.1764218Z Installing : gcc-7.3.1-15.amzn2.x86_64 43/68
2022-09-27T15:44:00.2829016Z Installing : kernel-devel-4.14.291-218.527.amzn2.x86_64 44/68
2022-09-27T15:44:00.8857341Z Installing : systemtap-devel-4.5-1.amzn2.0.1.x86_64 45/68
2022-09-27T15:44:00.9405612Z Installing : systemtap-client-4.5-1.amzn2.0.1.x86_64 46/68
2022-09-27T15:44:00.9950768Z Installing : pakchois-0.4-10.amzn2.0.2.x86_64 47/68
2022-09-27T15:44:01.1271357Z Installing : neon-0.30.0-3.amzn2.0.2.x86_64 48/68
2022-09-27T15:44:01.3015136Z Installing : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 49/68
2022-09-27T15:44:01.4006488Z Installing : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68
2022-09-27T15:44:02.6155815Z Installing : systemtap-4.5-1.amzn2.0.1.x86_64 51/68
2022-09-27T15:44:04.2224136Z Installing : gcc-gfortran-7.3.1-15.amzn2.x86_64 52/68
2022-09-27T15:44:04.3397976Z Installing : gcc-c++-7.3.1-15.amzn2.x86_64 53/68
2022-09-27T15:44:04.3777627Z Installing : libtool-2.4.2-22.2.amzn2.0.2.x86_64 54/68
2022-09-27T15:44:04.4151597Z Installing : intltool-0.50.2-7.amzn2.noarch 55/68
2022-09-27T15:44:04.4685855Z Installing : rpm-build-4.11.3-48.amzn2.0.2.x86_64 56/68
2022-09-27T15:44:04.5260211Z Installing : cscope-15.8-10.amzn2.0.2.x86_64 57/68
2022-09-27T15:44:04.6289734Z Installing : flex-2.5.37-3.amzn2.0.3.x86_64 58/68
2022-09-27T15:44:04.6914766Z Installing : bison-3.0.4-6.amzn2.0.2.x86_64 59/68
2022-09-27T15:44:04.7360577Z Installing : rcs-5.9.0-5.amzn2.0.2.x86_64 60/68
2022-09-27T15:44:04.7714142Z Installing : ctags-5.8-13.amzn2.0.2.x86_64 61/68
2022-09-27T15:44:04.8139643Z Installing : indent-2.2.11-13.amzn2.0.2.x86_64 62/68
2022-09-27T15:44:05.4846843Z Installing : patchutils-0.3.3-4.amzn2.0.1.x86_64 63/68
2022-09-27T15:44:05.5292066Z Installing : 1:doxygen-1.8.5-4.amzn2.x86_64 64/68
2022-09-27T15:44:05.5546432Z Installing : diffstat-1.57-4.amzn2.0.2.x86_64 65/68
2022-09-27T15:44:05.8621877Z Installing : byacc-1.9.20130304-3.amzn2.0.2.x86_64 66/68
2022-09-27T15:44:05.9035291Z Installing : swig-3.0.12-11.amzn2.0.3.x86_64 67/68
2022-09-27T15:44:05.9672126Z Installing : rpm-sign-4.11.3-48.amzn2.0.2.x86_64 68/68
2022-09-27T15:44:05.9792872Z Verifying : elfutils-libelf-devel-0.176-2.amzn2.x86_64 1/68
2022-09-27T15:44:05.9904661Z Verifying : perl-Thread-Queue-3.02-2.amzn2.noarch 2/68
2022-09-27T15:44:05.9994106Z Verifying : gettext-devel-0.19.8.1-3.amzn2.x86_64 3/68
2022-09-27T15:44:06.0094063Z Verifying : patch-2.7.1-12.amzn2.0.2.x86_64 4/68
2022-09-27T15:44:06.0182490Z Verifying : flex-2.5.37-3.amzn2.0.3.x86_64 5/68
2022-09-27T15:44:06.0269854Z Verifying : glibc-headers-2.26-60.amzn2.x86_64 6/68
2022-09-27T15:44:06.0369695Z Verifying : pakchois-0.4-10.amzn2.0.2.x86_64 7/68
2022-09-27T15:44:06.0460022Z Verifying : rpm-sign-4.11.3-48.amzn2.0.2.x86_64 8/68
2022-09-27T15:44:06.0544538Z Verifying : gcc-gfortran-7.3.1-15.amzn2.x86_64 9/68
2022-09-27T15:44:06.0635285Z Verifying : swig-3.0.12-11.amzn2.0.3.x86_64 10/68
2022-09-27T15:44:06.0739228Z Verifying : byacc-1.9.20130304-3.amzn2.0.2.x86_64 11/68
2022-09-27T15:44:06.0832755Z Verifying : libmpc-1.0.1-3.amzn2.0.2.x86_64 12/68
2022-09-27T15:44:06.0920803Z Verifying : libcilkrts-7.3.1-15.amzn2.x86_64 13/68
2022-09-27T15:44:06.1012473Z Verifying : go-srpm-macros-3.0.15-23.amzn2.0.1.noarch 14/68
2022-09-27T15:44:06.1098610Z Verifying : libproxy-0.4.11-10.amzn2.0.3.x86_64 15/68
2022-09-27T15:44:06.1182975Z Verifying : cscope-15.8-10.amzn2.0.2.x86_64 16/68
2022-09-27T15:44:06.1268108Z Verifying : diffstat-1.57-4.amzn2.0.2.x86_64 17/68
2022-09-27T15:44:06.1360588Z Verifying : 1:doxygen-1.8.5-4.amzn2.x86_64 18/68
2022-09-27T15:44:06.1453807Z Verifying : gcc-c++-7.3.1-15.amzn2.x86_64 19/68
2022-09-27T15:44:06.1539394Z Verifying : libatomic-7.3.1-15.amzn2.x86_64 20/68
2022-09-27T15:44:06.1623871Z Verifying : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 21/68
2022-09-27T15:44:06.1714462Z Verifying : systemtap-devel-4.5-1.amzn2.0.1.x86_64 22/68
2022-09-27T15:44:06.1795536Z Verifying : perl-Test-Harness-3.28-3.amzn2.noarch 23/68
2022-09-27T15:44:06.1879006Z Verifying : autoconf-2.69-11.amzn2.noarch 24/68
2022-09-27T15:44:06.1977288Z Verifying : libquadmath-7.3.1-15.amzn2.x86_64 25/68
2022-09-27T15:44:06.2064949Z Verifying : intltool-0.50.2-7.amzn2.noarch 26/68
2022-09-27T15:44:06.2151402Z Verifying : apr-util-1.6.1-5.amzn2.0.2.x86_64 27/68
2022-09-27T15:44:06.2238307Z Verifying : glibc-devel-2.26-60.amzn2.x86_64 28/68
2022-09-27T15:44:06.2384769Z Verifying : kernel-devel-4.14.291-218.527.amzn2.x86_64 29/68
2022-09-27T15:44:06.2470165Z Verifying : cpp-7.3.1-15.amzn2.x86_64 30/68
2022-09-27T15:44:06.2568281Z Verifying : rpm-build-4.11.3-48.amzn2.0.2.x86_64 31/68
2022-09-27T15:44:06.2654644Z Verifying : gettext-common-devel-0.19.8.1-3.amzn2.noarch 32/68
2022-09-27T15:44:06.2737248Z Verifying : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 33/68
2022-09-27T15:44:06.2817417Z Verifying : perl-srpm-macros-1-8.amzn2.0.1.noarch 34/68
2022-09-27T15:44:06.2905871Z Verifying : gnutls-3.3.29-9.amzn2.0.1.x86_64 35/68
2022-09-27T15:44:06.2994868Z Verifying : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 36/68
2022-09-27T15:44:06.3093351Z Verifying : automake-1.13.4-3.1.amzn2.noarch 37/68
2022-09-27T15:44:06.3177879Z Verifying : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 38/68
2022-09-27T15:44:06.3262853Z Verifying : libmpx-7.3.1-15.amzn2.x86_64 39/68
2022-09-27T15:44:06.3357022Z Verifying : avahi-libs-0.6.31-20.amzn2.x86_64 40/68
2022-09-27T15:44:06.3451567Z Verifying : bison-3.0.4-6.amzn2.0.2.x86_64 41/68
2022-09-27T15:44:06.3535244Z Verifying : libgfortran-7.3.1-15.amzn2.x86_64 42/68
2022-09-27T15:44:06.3639076Z Verifying : gdb-8.0.1-36.amzn2.0.1.x86_64 43/68
2022-09-27T15:44:06.3719656Z Verifying : patchutils-0.3.3-4.amzn2.0.1.x86_64 44/68
2022-09-27T15:44:06.3814944Z Verifying : libitm-7.3.1-15.amzn2.x86_64 45/68
2022-09-27T15:44:06.3899723Z Verifying : libtool-2.4.2-22.2.amzn2.0.2.x86_64 46/68
2022-09-27T15:44:06.3989915Z Verifying : gcc-7.3.1-15.amzn2.x86_64 47/68
2022-09-27T15:44:06.4074011Z Verifying : indent-2.2.11-13.amzn2.0.2.x86_64 48/68
2022-09-27T15:44:06.4169932Z Verifying : kernel-headers-4.14.291-218.527.amzn2.x86_64 49/68
2022-09-27T15:44:06.4255558Z Verifying : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68
2022-09-27T15:44:06.4341267Z Verifying : apr-1.7.0-9.amzn2.x86_64 51/68
2022-09-27T15:44:06.4426545Z Verifying : ctags-5.8-13.amzn2.0.2.x86_64 52/68
2022-09-27T15:44:06.4507145Z Verifying : 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 53/68
2022-09-27T15:44:06.4586244Z Verifying : mpfr-3.1.1-4.amzn2.0.2.x86_64 54/68
2022-09-27T15:44:06.4672716Z Verifying : trousers-0.3.14-2.amzn2.0.2.x86_64 55/68
2022-09-27T15:44:06.4752587Z Verifying : neon-0.30.0-3.amzn2.0.2.x86_64 56/68
2022-09-27T15:44:06.4839199Z Verifying : systemtap-4.5-1.amzn2.0.1.x86_64 57/68
2022-09-27T15:44:06.4932084Z Verifying : dwz-0.11-3.amzn2.0.3.x86_64 58/68
2022-09-27T15:44:06.5030464Z Verifying : systemtap-client-4.5-1.amzn2.0.1.x86_64 59/68
2022-09-27T15:44:06.5115404Z Verifying : efivar-libs-31-4.amzn2.0.4.x86_64 60/68
2022-09-27T15:44:06.5199728Z Verifying : rcs-5.9.0-5.amzn2.0.2.x86_64 61/68
2022-09-27T15:44:06.5288583Z Verifying : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 62/68
2022-09-27T15:44:06.5373107Z Verifying : libsanitizer-7.3.1-15.amzn2.x86_64 63/68
2022-09-27T15:44:06.5458009Z Verifying : elfutils-0.176-2.amzn2.x86_64 64/68
2022-09-27T15:44:06.5539053Z Verifying : m4-1.4.16-10.amzn2.0.2.x86_64 65/68
2022-09-27T15:44:06.5619865Z Verifying : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 66/68
2022-09-27T15:44:06.5700535Z Verifying : libmodman-2.0.1-8.amzn2.0.2.x86_64 67/68
2022-09-27T15:44:06.6417486Z Verifying : zlib-devel-1.2.7-19.amzn2.0.1.x86_64 68/68
2022-09-27T15:44:06.6417726Z 
2022-09-27T15:44:06.6421263Z Installed:
2022-09-27T15:44:06.6422148Z autoconf.noarch 0:2.69-11.amzn2
2022-09-27T15:44:06.6422618Z automake.noarch 0:1.13.4-3.1.amzn2
2022-09-27T15:44:06.6423046Z bison.x86_64 0:3.0.4-6.amzn2.0.2
2022-09-27T15:44:06.6423637Z byacc.x86_64 0:1.9.20130304-3.amzn2.0.2
2022-09-27T15:44:06.6424066Z cscope.x86_64 0:15.8-10.amzn2.0.2
2022-09-27T15:44:06.6426121Z ctags.x86_64 0:5.8-13.amzn2.0.2
2022-09-27T15:44:06.6426588Z diffstat.x86_64 0:1.57-4.amzn2.0.2
2022-09-27T15:44:06.6427024Z doxygen.x86_64 1:1.8.5-4.amzn2
2022-09-27T15:44:06.6427448Z elfutils.x86_64 0:0.176-2.amzn2
2022-09-27T15:44:06.6427849Z flex.x86_64 0:2.5.37-3.amzn2.0.3
2022-09-27T15:44:06.6428266Z gcc.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6430089Z gcc-c++.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6430563Z gcc-gfortran.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6431324Z indent.x86_64 0:2.2.11-13.amzn2.0.2
2022-09-27T15:44:06.6431768Z intltool.noarch 0:0.50.2-7.amzn2
2022-09-27T15:44:06.6432182Z libtool.x86_64 0:2.4.2-22.2.amzn2.0.2
2022-09-27T15:44:06.6432606Z patch.x86_64 0:2.7.1-12.amzn2.0.2
2022-09-27T15:44:06.6433034Z patchutils.x86_64 0:0.3.3-4.amzn2.0.1
2022-09-27T15:44:06.6433452Z rcs.x86_64 0:5.9.0-5.amzn2.0.2
2022-09-27T15:44:06.6433846Z rpm-build.x86_64 0:4.11.3-48.amzn2.0.2
2022-09-27T15:44:06.6434269Z rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2
2022-09-27T15:44:06.6434702Z subversion.x86_64 0:1.7.14-16.amzn2.0.1
2022-09-27T15:44:06.6435110Z swig.x86_64 0:3.0.12-11.amzn2.0.3
2022-09-27T15:44:06.6435553Z system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14
2022-09-27T15:44:06.6436004Z systemtap.x86_64 0:4.5-1.amzn2.0.1
2022-09-27T15:44:06.6436229Z 
2022-09-27T15:44:06.6436354Z Dependency Installed:
2022-09-27T15:44:06.6436725Z apr.x86_64 0:1.7.0-9.amzn2
2022-09-27T15:44:06.6437139Z apr-util.x86_64 0:1.6.1-5.amzn2.0.2
2022-09-27T15:44:06.6437566Z apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2
2022-09-27T15:44:06.6437977Z avahi-libs.x86_64 0:0.6.31-20.amzn2
2022-09-27T15:44:06.6438399Z cpp.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6438801Z dwz.x86_64 0:0.11-3.amzn2.0.3
2022-09-27T15:44:06.6439243Z efivar-libs.x86_64 0:31-4.amzn2.0.4
2022-09-27T15:44:06.6439880Z elfutils-libelf-devel.x86_64 0:0.176-2.amzn2
2022-09-27T15:44:06.6440331Z emacs-filesystem.noarch 1:27.2-4.amzn2.0.1
2022-09-27T15:44:06.6440767Z gdb.x86_64 0:8.0.1-36.amzn2.0.1
2022-09-27T15:44:06.6441208Z gettext-common-devel.noarch 0:0.19.8.1-3.amzn2
2022-09-27T15:44:06.6441646Z gettext-devel.x86_64 0:0.19.8.1-3.amzn2
2022-09-27T15:44:06.6442074Z glibc-devel.x86_64 0:2.26-60.amzn2
2022-09-27T15:44:06.6442500Z glibc-headers.x86_64 0:2.26-60.amzn2
2022-09-27T15:44:06.6442901Z gnutls.x86_64 0:3.3.29-9.amzn2.0.1
2022-09-27T15:44:06.6443415Z go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.1
2022-09-27T15:44:06.6443874Z kernel-devel.x86_64 0:4.14.291-218.527.amzn2
2022-09-27T15:44:06.6444313Z kernel-headers.x86_64 0:4.14.291-218.527.amzn2
2022-09-27T15:44:06.6444719Z libatomic.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6445134Z libcilkrts.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6445551Z libgfortran.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6445945Z libitm.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6446356Z libmodman.x86_64 0:2.0.1-8.amzn2.0.2
2022-09-27T15:44:06.6446769Z libmpc.x86_64 0:1.0.1-3.amzn2.0.2
2022-09-27T15:44:06.6447187Z libmpx.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6447588Z libproxy.x86_64 0:0.4.11-10.amzn2.0.3
2022-09-27T15:44:06.6448012Z libquadmath.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6448436Z libsanitizer.x86_64 0:7.3.1-15.amzn2
2022-09-27T15:44:06.6448830Z m4.x86_64 0:1.4.16-10.amzn2.0.2
2022-09-27T15:44:06.6449236Z mokutil.x86_64 1:0.3.0-10.amzn2.0.1
2022-09-27T15:44:06.6449640Z mpfr.x86_64 0:3.1.1-4.amzn2.0.2
2022-09-27T15:44:06.6450046Z neon.x86_64 0:0.30.0-3.amzn2.0.2
2022-09-27T15:44:06.6450440Z pakchois.x86_64 0:0.4-10.amzn2.0.2
2022-09-27T15:44:06.6450880Z perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2
2022-09-27T15:44:06.6451336Z perl-Test-Harness.noarch 0:3.28-3.amzn2
2022-09-27T15:44:06.6451781Z perl-Thread-Queue.noarch 0:3.02-2.amzn2
2022-09-27T15:44:06.6452247Z perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2
2022-09-27T15:44:06.6452708Z perl-srpm-macros.noarch 0:1-8.amzn2.0.1
2022-09-27T15:44:06.6453165Z subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1
2022-09-27T15:44:06.6453591Z systemtap-client.x86_64 0:4.5-1.amzn2.0.1
2022-09-27T15:44:06.6454035Z systemtap-devel.x86_64 0:4.5-1.amzn2.0.1
2022-09-27T15:44:06.6454472Z trousers.x86_64 0:0.3.14-2.amzn2.0.2
2022-09-27T15:44:06.6454873Z zlib-devel.x86_64 0:1.2.7-19.amzn2.0.1
2022-09-27T15:44:06.6455077Z 
2022-09-27T15:44:06.6455185Z Complete!
2022-09-27T15:44:06.6794817Z ++ uname -r
2022-09-27T15:44:06.6800559Z + sudo yum install -y 'kernel-devel-uname-r == 4.14.252-195.483.amzn2.x86_64'
2022-09-27T15:44:07.1858849Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
2022-09-27T15:44:07.1968084Z Existing lock /var/run/yum.pid: another copy is running as pid 33177.
2022-09-27T15:44:07.1968928Z Another app is currently holding the yum lock; waiting for it to exit...
2022-09-27T15:44:07.1976424Z The other application is: yum
2022-09-27T15:44:07.1976981Z Memory : 87 M RSS (304 MB VSZ)
2022-09-27T15:44:07.1977742Z Started: Tue Sep 27 15:44:05 2022 - 00:02 ago
2022-09-27T15:44:07.1978313Z State : Running, pid: 33177
2022-09-27T15:44:09.2003613Z Another app is currently holding the yum lock; waiting for it to exit...
2022-09-27T15:44:09.2009859Z The other application is: yum
2022-09-27T15:44:09.2010440Z Memory : 163 M RSS (382 MB VSZ)
2022-09-27T15:44:09.2011582Z Started: Tue Sep 27 15:44:05 2022 - 00:04 ago
2022-09-27T15:44:09.2012189Z State : Running, pid: 33177
2022-09-27T15:44:11.4690997Z Resolving Dependencies
2022-09-27T15:44:11.4696977Z --> Running transaction check
2022-09-27T15:44:11.4697460Z ---> Package kernel-devel.x86_64 0:4.14.252-195.483.amzn2 will be installed
2022-09-27T15:44:11.7536867Z --> Finished Dependency Resolution
2022-09-27T15:44:11.8336479Z 
2022-09-27T15:44:11.8337028Z Dependencies Resolved
2022-09-27T15:44:11.8342694Z 
2022-09-27T15:44:11.8343058Z ================================================================================
2022-09-27T15:44:11.8343437Z Package Arch Version Repository Size
2022-09-27T15:44:11.8343780Z ================================================================================
2022-09-27T15:44:11.8344045Z Installing:
2022-09-27T15:44:11.8344532Z kernel-devel x86_64 4.14.252-195.483.amzn2 amzn2-core 13 M
2022-09-27T15:44:11.8344773Z 
2022-09-27T15:44:11.8344891Z Transaction Summary
2022-09-27T15:44:11.8345177Z ================================================================================
2022-09-27T15:44:11.8345440Z Install 1 Package
2022-09-27T15:44:11.8345598Z 
2022-09-27T15:44:11.8345726Z Total download size: 13 M
2022-09-27T15:44:11.8345987Z Installed size: 60 M
2022-09-27T15:44:11.8346375Z Downloading packages:
2022-09-27T15:44:11.8355592Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
2022-09-27T15:44:12.1285883Z Running transaction check
2022-09-27T15:44:12.1467025Z Running transaction test
2022-09-27T15:44:12.5478454Z Transaction test succeeded
2022-09-27T15:44:12.5481790Z Running transaction
2022-09-27T15:44:27.5070106Z Installing : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1
2022-09-27T15:44:27.5851451Z Verifying : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1
2022-09-27T15:44:27.5851763Z 
2022-09-27T15:44:27.5851878Z Installed:
2022-09-27T15:44:27.5852292Z kernel-devel.x86_64 0:4.14.252-195.483.amzn2
2022-09-27T15:44:27.5852512Z 
2022-09-27T15:44:27.5852608Z Complete!
2022-09-27T15:44:27.6158819Z + sudo modprobe backlight
2022-09-27T15:44:27.6333881Z + sudo curl -fsL -o /tmp/nvidia_driver https://s3.amazonaws.com/ossci-linux/nvidia_driver/NVIDIA-Linux-x86_64-515.57.run
2022-09-27T15:44:31.3017041Z + sudo /bin/bash /tmp/nvidia_driver -s --no-drm
2022-09-27T15:44:32.6979414Z Verifying archive integrity... OK
2022-09-27T15:44:59.3597124Z Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 515.57........
2022-09-27T15:44:59.5009493Z 
2022-09-27T15:44:59.5010186Z WARNING: The nvidia-drm module will not be installed. As a result, DRM-KMS will not function with this installation of the NVIDIA driver.
2022-09-27T15:44:59.5012816Z 
2022-09-27T15:45:15.0554096Z 
2022-09-27T15:45:15.0555549Z WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your distribution and reinstall the driver.
2022-09-27T15:45:15.0557108Z 
2022-09-27T15:45:23.3719276Z + sudo rm -fv /tmp/nvidia_driver
2022-09-27T15:45:23.4512642Z removed ‘/tmp/nvidia_driver’
2022-09-27T15:45:23.4526724Z + nvidia-smi
2022-09-27T15:45:28.0194715Z Tue Sep 27 15:45:28 2022
2022-09-27T15:45:28.0195342Z +-----------------------------------------------------------------------------+
2022-09-27T15:45:28.0195868Z | NVIDIA-SMI 515.57       Driver Version: 515.57       CUDA Version: 11.7     |
2022-09-27T15:45:28.0196358Z |-------------------------------+----------------------+----------------------+
2022-09-27T15:45:28.0196857Z | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
2022-09-27T15:45:28.0198451Z | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
2022-09-27T15:45:28.0198867Z |                               |                      |               MIG M. |
2022-09-27T15:45:28.0199179Z |===============================+======================+======================|
2022-09-27T15:45:28.0240816Z |   0  Tesla M60           Off  | 00000000:00:1D.0 Off |           3305405958 |
2022-09-27T15:45:28.0241164Z | N/A   32C    P0    37W / 150W |      0MiB /  7680MiB |      0%      Default |
2022-09-27T15:45:28.0241498Z |                               |                      |                  N/A |
2022-09-27T15:45:28.0241977Z +-------------------------------+----------------------+----------------------+
2022-09-27T15:45:28.0286903Z |   1  Tesla M60           Off  | 00000000:00:1E.0 Off |                    0 |
2022-09-27T15:45:28.0287238Z | N/A   26C    P0    38W / 150W |      0MiB /  7680MiB |     99%      Default |
2022-09-27T15:45:28.0287560Z |                               |                      |                  N/A |
2022-09-27T15:45:28.0288044Z +-------------------------------+----------------------+----------------------+
2022-09-27T15:45:28.0288411Z 
2022-09-27T15:45:28.0288831Z +-----------------------------------------------------------------------------+
2022-09-27T15:45:28.0289203Z | Processes:                                                                  |
2022-09-27T15:45:28.0289546Z |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
2022-09-27T15:45:28.0289872Z |        ID   ID                                                   Usage      |
2022-09-27T15:45:28.0290180Z |=============================================================================|
2022-09-27T15:45:28.0292784Z |  No running processes found                                                 |
2022-09-27T15:45:28.0293286Z +-----------------------------------------------------------------------------+
2022-09-27T15:45:28.5588799Z == Installing nvidia container toolkit for amzn2 ==
2022-09-27T15:45:28.5592858Z + sudo yum install -y yum-utils
2022-09-27T15:45:29.1062146Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
2022-09-27T15:45:29.3711671Z Package yum-utils-1.1.31-46.amzn2.0.1.noarch already installed and latest version
2022-09-27T15:45:29.3712330Z Nothing to do
2022-09-27T15:45:29.3906560Z + sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo
2022-09-27T15:45:29.9499842Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
2022-09-27T15:45:29.9778355Z adding repo from: https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo
2022-09-27T15:45:29.9779058Z grabbing file https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo to /etc/yum.repos.d/nvidia-docker.repo
2022-09-27T15:45:29.9779578Z repo saved to /etc/yum.repos.d/nvidia-docker.repo
2022-09-27T15:45:29.9922287Z + sudo yum install -y nvidia-docker2
2022-09-27T15:45:30.5315190Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
2022-09-27T15:45:30.5763991Z Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey
2022-09-27T15:45:30.5869512Z Importing GPG key 0xF796ECB0:
2022-09-27T15:45:30.5869934Z Userid : "NVIDIA CORPORATION (Open Source Projects) <cudatools@nvidia.com>"
2022-09-27T15:45:30.5870312Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
2022-09-27T15:45:30.5871113Z From : https://nvidia.github.io/libnvidia-container/gpgkey
2022-09-27T15:45:30.9780761Z Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey
2022-09-27T15:45:30.9871652Z Importing GPG key 0xF796ECB0:
2022-09-27T15:45:30.9872331Z Userid : "NVIDIA CORPORATION (Open Source Projects) <cudatools@nvidia.com>"
2022-09-27T15:45:30.9872749Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
2022-09-27T15:45:30.9873306Z From : https://nvidia.github.io/nvidia-container-runtime/gpgkey
2022-09-27T15:45:31.2133549Z Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey
2022-09-27T15:45:31.2371862Z Importing GPG key 0xF796ECB0:
2022-09-27T15:45:31.2372269Z Userid : "NVIDIA CORPORATION (Open Source Projects) <cudatools@nvidia.com>"
2022-09-27T15:45:31.2372679Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
2022-09-27T15:45:31.2373141Z From : https://nvidia.github.io/nvidia-docker/gpgkey
2022-09-27T15:45:32.8947688Z Resolving Dependencies
2022-09-27T15:45:32.8954792Z --> Running transaction check
2022-09-27T15:45:32.8955614Z ---> Package nvidia-docker2.noarch 0:2.11.0-1 will be installed
2022-09-27T15:45:32.8980842Z --> Processing Dependency: nvidia-container-toolkit >= 1.10.0-1 for package: nvidia-docker2-2.11.0-1.noarch
2022-09-27T15:45:32.9346798Z --> Running transaction check
2022-09-27T15:45:32.9347278Z ---> Package nvidia-container-toolkit.x86_64 0:1.11.0-1 will be installed
2022-09-27T15:45:32.9500678Z --> Processing Dependency: nvidia-container-toolkit-base = 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64
2022-09-27T15:45:32.9510750Z --> Processing Dependency: libnvidia-container-tools < 2.0.0 for package: nvidia-container-toolkit-1.11.0-1.x86_64
2022-09-27T15:45:32.9637163Z --> Processing Dependency: libnvidia-container-tools >= 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64
2022-09-27T15:45:32.9637678Z --> Running transaction check
2022-09-27T15:45:32.9638613Z ---> Package libnvidia-container-tools.x86_64 0:1.11.0-1 will be installed
2022-09-27T15:45:32.9648562Z --> Processing Dependency: libnvidia-container1(x86-64) >= 1.11.0-1 for package: libnvidia-container-tools-1.11.0-1.x86_64
2022-09-27T15:45:32.9674628Z --> Processing Dependency: libnvidia-container.so.1(NVC_1.0)(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64
2022-09-27T15:45:32.9675360Z --> Processing Dependency: libnvidia-container.so.1()(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64
2022-09-27T15:45:32.9676120Z ---> Package nvidia-container-toolkit-base.x86_64 0:1.11.0-1 will be installed
2022-09-27T15:45:32.9678769Z --> Running transaction check
2022-09-27T15:45:32.9679228Z ---> Package libnvidia-container1.x86_64 0:1.11.0-1 will be installed
2022-09-27T15:45:33.2610895Z --> Finished Dependency Resolution
2022-09-27T15:45:33.3341632Z 
2022-09-27T15:45:33.3341837Z Dependencies Resolved
2022-09-27T15:45:33.3355472Z 
2022-09-27T15:45:33.3355855Z ================================================================================
2022-09-27T15:45:33.3356232Z Package Arch Version Repository Size
2022-09-27T15:45:33.3356582Z ================================================================================
2022-09-27T15:45:33.3356828Z Installing:
2022-09-27T15:45:33.3357294Z nvidia-docker2 noarch 2.11.0-1 libnvidia-container 8.7 k
2022-09-27T15:45:33.3357656Z Installing for dependencies:
2022-09-27T15:45:33.3358119Z libnvidia-container-tools x86_64 1.11.0-1 libnvidia-container 49 k
2022-09-27T15:45:33.3358770Z libnvidia-container1 x86_64 1.11.0-1 libnvidia-container 1.0 M
2022-09-27T15:45:33.3359291Z nvidia-container-toolkit x86_64 1.11.0-1 libnvidia-container 780 k
2022-09-27T15:45:33.3359838Z nvidia-container-toolkit-base x86_64 1.11.0-1 libnvidia-container 2.5 M
2022-09-27T15:45:33.3360106Z 
2022-09-27T15:45:33.3360205Z Transaction Summary
2022-09-27T15:45:33.3360497Z ================================================================================
2022-09-27T15:45:33.3360811Z Install 1 Package (+4 Dependent packages)
2022-09-27T15:45:33.3361004Z 
2022-09-27T15:45:33.3361132Z Total download size: 4.3 M
2022-09-27T15:45:33.3361382Z Installed size: 12 M
2022-09-27T15:45:33.3361645Z Downloading packages:
2022-09-27T15:45:33.4477973Z --------------------------------------------------------------------------------
2022-09-27T15:45:33.4478758Z Total 39 MB/s | 4.3 MB 00:00
2022-09-27T15:45:33.4522964Z Running transaction check
2022-09-27T15:45:33.4694558Z Running transaction test
2022-09-27T15:45:33.4857475Z Transaction test succeeded
2022-09-27T15:45:33.4860857Z Running transaction
2022-09-27T15:45:33.9794727Z Installing : nvidia-container-toolkit-base-1.11.0-1.x86_64 1/5
2022-09-27T15:45:34.0353028Z Installing : libnvidia-container1-1.11.0-1.x86_64 2/5
2022-09-27T15:45:34.1780510Z Installing : libnvidia-container-tools-1.11.0-1.x86_64 3/5
2022-09-27T15:45:34.2027112Z Installing : nvidia-container-toolkit-1.11.0-1.x86_64 4/5
2022-09-27T15:45:34.2390178Z Installing : nvidia-docker2-2.11.0-1.noarch 5/5
2022-09-27T15:45:34.2503510Z Verifying : libnvidia-container1-1.11.0-1.x86_64 1/5
2022-09-27T15:45:34.2593916Z Verifying : nvidia-container-toolkit-base-1.11.0-1.x86_64 2/5
2022-09-27T15:45:34.2696223Z Verifying : nvidia-container-toolkit-1.11.0-1.x86_64 3/5
2022-09-27T15:45:34.2789014Z Verifying : libnvidia-container-tools-1.11.0-1.x86_64 4/5
2022-09-27T15:45:34.3551818Z Verifying : nvidia-docker2-2.11.0-1.noarch 5/5
2022-09-27T15:45:34.3552113Z 
2022-09-27T15:45:34.3552202Z Installed:
2022-09-27T15:45:34.3552602Z nvidia-docker2.noarch 0:2.11.0-1
2022-09-27T15:45:34.3552821Z 
2022-09-27T15:45:34.3552965Z Dependency Installed:
2022-09-27T15:45:34.3553400Z libnvidia-container-tools.x86_64 0:1.11.0-1
2022-09-27T15:45:34.3553851Z libnvidia-container1.x86_64 0:1.11.0-1
2022-09-27T15:45:34.3554313Z nvidia-container-toolkit.x86_64 0:1.11.0-1
2022-09-27T15:45:34.3554798Z nvidia-container-toolkit-base.x86_64 0:1.11.0-1
2022-09-27T15:45:34.3555043Z 
2022-09-27T15:45:34.3555155Z Complete!
2022-09-27T15:45:34.4505354Z + sudo systemctl restart docker
2022-09-27T15:45:42.1958830Z + echo 'GPU_FLAG=--gpus all'
2022-09-27T15:45:42.7643185Z Command completed after 1 attempt(s).
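With the driver and container toolkit in place, the next step pins psutil==5.9.1 and pynvml==11.4.1 and launches tools.stats.monitor in the background, redirecting its samples to usage_log.txt. As a rough sketch of what such a monitor can do with those two libraries (an illustration only, not the actual tools.stats.monitor source; the one-second interval and the field names below are assumptions), a minimal sampler might be:

    import json
    import time

    import psutil
    import pynvml

    # Initialize NVML once; index 0 matches the first GPU reported by nvidia-smi above.
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    def snapshot():
        # One utilization/memory sample across CPU, host RAM and GPU 0.
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        return {
            "time": time.time(),
            "cpu_percent": psutil.cpu_percent(),
            "ram_percent": psutil.virtual_memory().percent,
            "gpu_percent": util.gpu,
            "gpu_mem_used_mb": mem.used // (1024 * 1024),
        }

    try:
        while True:
            # One JSON record per line, easy to tail or parse after the job.
            print(json.dumps(snapshot()), flush=True)
            time.sleep(1)
    finally:
        pynvml.nvmlShutdown()

Run in the background with output redirected to a file (python3 monitor.py > usage_log.txt 2>&1 &), this yields the same kind of append-only usage log the workflow captures below.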
2022-09-27T15:45:42.7643405Z 
2022-09-27T15:45:42.7695879Z ##[group]Run python3 -m pip install psutil==5.9.1
2022-09-27T15:45:42.7696271Z python3 -m pip install psutil==5.9.1
2022-09-27T15:45:42.7696595Z python3 -m pip install pynvml==11.4.1
2022-09-27T15:45:42.7696931Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 &
2022-09-27T15:45:42.7697310Z echo "::set-output name=monitor-script-pid::${!}"
2022-09-27T15:45:42.7710960Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2022-09-27T15:45:42.7711260Z env:
2022-09-27T15:45:42.7711501Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:45:42.7711769Z GPU_FLAG: --gpus all
2022-09-27T15:45:42.7712001Z ##[endgroup]
2022-09-27T15:45:43.5167674Z Defaulting to user installation because normal site-packages is not writeable
2022-09-27T15:45:43.8766933Z Collecting psutil==5.9.1
2022-09-27T15:45:43.8950678Z Downloading psutil-5.9.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (281 kB)
2022-09-27T15:45:43.9624955Z Installing collected packages: psutil
2022-09-27T15:45:44.1190629Z Successfully installed psutil-5.9.1
2022-09-27T15:45:44.5818220Z Defaulting to user installation because normal site-packages is not writeable
2022-09-27T15:45:44.6652417Z Collecting pynvml==11.4.1
2022-09-27T15:45:44.6818754Z Downloading pynvml-11.4.1-py3-none-any.whl (46 kB)
2022-09-27T15:45:44.7306758Z Installing collected packages: pynvml
2022-09-27T15:45:44.7838335Z Successfully installed pynvml-11.4.1
2022-09-27T15:45:44.8354152Z Prepare all required actions
2022-09-27T15:45:44.8354524Z Getting action download info
2022-09-27T15:45:45.0078200Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:ada9688bc02703b63dc0e606da280613803449a5)
2022-09-27T15:45:45.2903858Z Download action repository 'actions/download-artifact@v2' (SHA:f023be2c48cc18debc3bacd34cb396e0295e2869)
2022-09-27T15:45:45.4191257Z ##[group]Run ./.github/actions/download-build-artifacts
2022-09-27T15:45:45.4191551Z with:
2022-09-27T15:45:45.4191832Z name: linux-bionic-cuda11.6-py3.10-gcc7
2022-09-27T15:45:45.4192120Z env:
2022-09-27T15:45:45.4192336Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:45:45.4192600Z GPU_FLAG: --gpus all
2022-09-27T15:45:45.4192844Z ##[endgroup]
2022-09-27T15:45:45.4225056Z ##[group]Run seemethere/download-artifact-s3@v4
2022-09-27T15:45:45.4225348Z with:
2022-09-27T15:45:45.4225612Z name: linux-bionic-cuda11.6-py3.10-gcc7
2022-09-27T15:45:45.4225922Z s3-bucket: gha-artifacts
2022-09-27T15:45:45.4226251Z region: us-east-1
2022-09-27T15:45:45.4226469Z env:
2022-09-27T15:45:45.4226711Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:45:45.4226979Z GPU_FLAG: --gpus all
2022-09-27T15:45:45.4227210Z ##[endgroup]
2022-09-27T15:45:45.9105303Z Found 1 objects with prefix pytorch/pytorch/3133193930/linux-bionic-cuda11.6-py3.10-gcc7/
2022-09-27T15:45:45.9105911Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip
2022-09-27T15:45:52.0498176Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip
2022-09-27T15:45:52.0498508Z 
2022-09-27T15:45:52.0499483Z Artifact download has finished successfully
2022-09-27T15:45:52.0634684Z ##[group]Run unzip -o artifacts.zip
2022-09-27T15:45:52.0635003Z unzip -o artifacts.zip
2022-09-27T15:45:52.0648415Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2022-09-27T15:45:52.0648729Z env:
2022-09-27T15:45:52.0648978Z GIT_DEFAULT_BRANCH: master
2022-09-27T15:45:52.0649247Z GPU_FLAG: --gpus all
2022-09-27T15:45:52.0649506Z ##[endgroup]
2022-09-27T15:45:52.0716873Z Archive: artifacts.zip
2022-09-27T15:45:52.0718751Z creating: dist/
2022-09-27T15:45:54.0997193Z inflating: dist/torch-1.13.0a0+git52424e2-cp310-cp310-linux_x86_64.whl
2022-09-27T15:45:54.0997635Z creating: build/custom_test_artifacts/
2022-09-27T15:45:54.0998042Z creating: build/custom_test_artifacts/custom-op-build/
2022-09-27T15:45:54.0998514Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/
2022-09-27T15:45:54.1005372Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeOutput.log
2022-09-27T15:45:54.1005887Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/
2022-09-27T15:45:54.1006443Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeSystem.cmake
2022-09-27T15:45:54.1007004Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/
2022-09-27T15:45:54.1007563Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/tmp/
2022-09-27T15:45:54.1009993Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c
2022-09-27T15:45:54.1011177Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/a.out
2022-09-27T15:45:54.1011736Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/
2022-09-27T15:45:54.1012301Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/
2022-09-27T15:45:54.1015110Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp
2022-09-27T15:45:54.1016227Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out
2022-09-27T15:45:54.1018278Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin
2022-09-27T15:45:54.1019029Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake
2022-09-27T15:45:54.1020467Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin
2022-09-27T15:45:54.1021752Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake
2022-09-27T15:45:54.1022375Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/
2022-09-27T15:45:54.1022978Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/
2022-09-27T15:45:54.1077414Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii
2022-09-27T15:45:54.1078124Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c
2022-09-27T15:45:54.1078855Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu
2022-09-27T15:45:54.1079592Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c
2022-09-27T15:45:54.1080318Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id
2022-09-27T15:45:54.1080990Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
2022-09-27T15:45:54.1081693Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin
2022-09-27T15:45:54.1082393Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin
2022-09-27T15:45:54.1083374Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c
2022-09-27T15:45:54.1125682Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii
2022-09-27T15:45:54.1166880Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp
2022-09-27T15:45:54.1167921Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o
2022-09-27T15:45:54.1168667Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin
2022-09-27T15:45:54.1169429Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c
2022-09-27T15:45:54.1170174Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin
2022-09-27T15:45:54.1171263Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c
2022-09-27T15:45:54.1172284Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o
2022-09-27T15:45:54.1174300Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu
2022-09-27T15:45:54.1247276Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out
2022-09-27T15:45:54.1320259Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin
2022-09-27T15:45:54.1320893Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake
2022-09-27T15:45:54.1321453Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/
2022-09-27T15:45:54.1322201Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeError.log
2022-09-27T15:45:54.1323071Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache
2022-09-27T15:45:54.1323847Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/
2022-09-27T15:45:54.1324474Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts
2022-09-27T15:45:54.1325100Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make
2022-09-27T15:45:54.1325712Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make
2022-09-27T15:45:54.1326275Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt
2022-09-27T15:45:54.1326879Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake
2022-09-27T15:45:54.1327927Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make
2022-09-27T15:45:54.1328529Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake
2022-09-27T15:45:54.1329181Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make
2022-09-27T15:45:54.1329811Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make
2022-09-27T15:45:54.1350920Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d
2022-09-27T15:45:54.1464434Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o
2022-09-27T15:45:54.1464992Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/
2022-09-27T15:45:54.1465600Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts
2022-09-27T15:45:54.1466239Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make
2022-09-27T15:45:54.1466851Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make
2022-09-27T15:45:54.1467431Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt
2022-09-27T15:45:54.1468043Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake
2022-09-27T15:45:54.1468969Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make
2022-09-27T15:45:54.1469834Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake
2022-09-27T15:45:54.1470487Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make
2022-09-27T15:45:54.1471520Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make
2022-09-27T15:45:54.1492216Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d
2022-09-27T15:45:54.1572828Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o
2022-09-27T15:45:54.1573490Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake
2022-09-27T15:45:54.1574112Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt
2022-09-27T15:45:54.1574680Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks
2022-09-27T15:45:54.1575435Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2
2022-09-27T15:45:54.1576662Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake
2022-09-27T15:45:54.1577350Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc
2022-09-27T15:45:54.1580386Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt
2022-09-27T15:45:54.1580946Z inflating: build/custom_test_artifacts/custom-op-build/Makefile
2022-09-27T15:45:54.1581895Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake
2022-09-27T15:45:54.1674242Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so
2022-09-27T15:45:54.1735292Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops
2022-09-27T15:45:54.1735779Z creating: build/custom_test_artifacts/jit-hook-build/
2022-09-27T15:45:54.1736246Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/
2022-09-27T15:45:54.1742816Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeOutput.log
2022-09-27T15:45:54.1743376Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/
2022-09-27T15:45:54.1743927Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeSystem.cmake
2022-09-27T15:45:54.1744481Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/
2022-09-27T15:45:54.1745009Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/tmp/
2022-09-27T15:45:54.1747143Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c
2022-09-27T15:45:54.1748272Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/a.out
2022-09-27T15:45:54.1748829Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/
2022-09-27T15:45:54.1749361Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/
2022-09-27T15:45:54.1752280Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp
2022-09-27T15:45:54.1753378Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out
2022-09-27T15:45:54.1755274Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin
2022-09-27T15:45:54.1755870Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake
2022-09-27T15:45:54.1757897Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin
2022-09-27T15:45:54.1758702Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake
2022-09-27T15:45:54.1759364Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/
2022-09-27T15:45:54.1760057Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/
2022-09-27T15:45:54.1813862Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii
2022-09-27T15:45:54.1814829Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c
2022-09-27T15:45:54.1815624Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu
2022-09-27T15:45:54.1816508Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c
2022-09-27T15:45:54.1817266Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id
2022-09-27T15:45:54.1818047Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
2022-09-27T15:45:54.1818814Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin
2022-09-27T15:45:54.1819584Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin
2022-09-27T15:45:54.1820404Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c
2022-09-27T15:45:54.1862132Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii
2022-09-27T15:45:54.1903362Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp
2022-09-27T15:45:54.1904366Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o
2022-09-27T15:45:54.1905774Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin
2022-09-27T15:45:54.1906435Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c
2022-09-27T15:45:54.1907309Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin
2022-09-27T15:45:54.1908217Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c
2022-09-27T15:45:54.1909317Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o
2022-09-27T15:45:54.1911185Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu
2022-09-27T15:45:54.1984117Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out
2022-09-27T15:45:54.2056993Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin
2022-09-27T15:45:54.2057773Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake
2022-09-27T15:45:54.2058351Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/
2022-09-27T15:45:54.2059282Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeError.log
2022-09-27T15:45:54.2059932Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache
2022-09-27T15:45:54.2060508Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/
2022-09-27T15:45:54.2061265Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts
2022-09-27T15:45:54.2062006Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make
2022-09-27T15:45:54.2062687Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make
2022-09-27T15:45:54.2063285Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt
2022-09-27T15:45:54.2063946Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake
2022-09-27T15:45:54.2064751Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make
2022-09-27T15:45:54.2065425Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake
2022-09-27T15:45:54.2066099Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make
2022-09-27T15:45:54.2066716Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make
2022-09-27T15:45:54.2087650Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d
2022-09-27T15:45:54.2150664Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o
2022-09-27T15:45:54.2151647Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake
2022-09-27T15:45:54.2152268Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt
2022-09-27T15:45:54.2152972Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks
2022-09-27T15:45:54.2153587Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2
2022-09-27T15:45:54.2154622Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake
2022-09-27T15:45:54.2155176Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc
2022-09-27T15:45:54.2158373Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt
2022-09-27T15:45:54.2159013Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile
2022-09-27T15:45:54.2159815Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake
2022-09-27T15:45:54.2209027Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks
2022-09-27T15:45:54.2209689Z creating: build/custom_test_artifacts/custom-backend-build/
2022-09-27T15:45:54.2248287Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/
2022-09-27T15:45:54.2248991Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeOutput.log
2022-09-27T15:45:54.2249638Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/
2022-09-27T15:45:54.2250237Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeSystem.cmake
2022-09-27T15:45:54.2250850Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/
2022-09-27T15:45:54.2251427Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/tmp/
2022-09-27T15:45:54.2252067Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c
2022-09-27T15:45:54.2252687Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/a.out
2022-09-27T15:45:54.2253262Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/
2022-09-27T15:45:54.2253862Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/
2022-09-27T15:45:54.2254528Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp
2022-09-27T15:45:54.2255171Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out
2022-09-27T15:45:54.2255868Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin
2022-09-27T15:45:54.2256536Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake
2022-09-27T15:45:54.2257198Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin
2022-09-27T15:45:54.2258038Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake
2022-09-27T15:45:54.2258631Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/
2022-09-27T15:45:54.2259411Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/
2022-09-27T15:45:54.2289506Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii
2022-09-27T15:45:54.2290306Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c
2022-09-27T15:45:54.2291327Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu
2022-09-27T15:45:54.2292549Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c
2022-09-27T15:45:54.2293422Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id
2022-09-27T15:45:54.2294469Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
2022-09-27T15:45:54.2295727Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin
2022-09-27T15:45:54.2296498Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin
2022-09-27T15:45:54.2297384Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c
2022-09-27T15:45:54.2339599Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii
2022-09-27T15:45:54.2380779Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp
2022-09-27T15:45:54.2381806Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o
2022-09-27T15:45:54.2382657Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin
2022-09-27T15:45:54.2383297Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c
2022-09-27T15:45:54.2384113Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin
2022-09-27T15:45:54.2385013Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c
2022-09-27T15:45:54.2386014Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o
2022-09-27T15:45:54.2388059Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu
2022-09-27T15:45:54.2461433Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out
2022-09-27T15:45:54.2534760Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin
2022-09-27T15:45:54.2535417Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake
2022-09-27T15:45:54.2535982Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/
2022-09-27T15:45:54.2537192Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeError.log
2022-09-27T15:45:54.2537975Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache
2022-09-27T15:45:54.2538571Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/
2022-09-27T15:45:54.2539174Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts
2022-09-27T15:45:54.2539836Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make
2022-09-27T15:45:54.2540642Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make
2022-09-27T15:45:54.2541261Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt
2022-09-27T15:45:54.2541896Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake
2022-09-27T15:45:54.2543098Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make
2022-09-27T15:45:54.2543741Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake
2022-09-27T15:45:54.2544495Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make
2022-09-27T15:45:54.2545144Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make
2022-09-27T15:45:54.2550000Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d
2022-09-27T15:45:54.2697063Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o
2022-09-27T15:45:54.2697695Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/
2022-09-27T15:45:54.2698464Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts
2022-09-27T15:45:54.2699164Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make
2022-09-27T15:45:54.2699797Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make
2022-09-27T15:45:54.2700428Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt
2022-09-27T15:45:54.2701075Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake
2022-09-27T15:45:54.2702081Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make
2022-09-27T15:45:54.2702720Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake
2022-09-27T15:45:54.2703369Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make
2022-09-27T15:45:54.2704013Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make
2022-09-27T15:45:54.2725268Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d
2022-09-27T15:45:54.2782445Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o
2022-09-27T15:45:54.2783139Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake
2022-09-27T15:45:54.2784314Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt
2022-09-27T15:45:54.2784980Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks
2022-09-27T15:45:54.2785528Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2
2022-09-27T15:45:54.2786338Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake
2022-09-27T15:45:54.2786900Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc
2022-09-27T15:45:54.2789913Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt
2022-09-27T15:45:54.2790597Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile
2022-09-27T15:45:54.2791841Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake
2022-09-27T15:45:54.2910465Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so
2022-09-27T15:45:54.2955757Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend
2022-09-27T15:45:54.2956101Z creating: build/lib/
2022-09-27T15:45:54.2956776Z inflating: build/lib/libclog.a
2022-09-27T15:45:54.3022978Z inflating: build/lib/libgtest.a
2022-09-27T15:45:54.3033596Z inflating: build/lib/libpthreadpool.a
2022-09-27T15:45:54.3124871Z inflating: build/lib/libbenchmark.a
2022-09-27T15:45:54.3231007Z inflating: build/lib/libprotobuf-lite.a
2022-09-27T15:45:54.3240513Z inflating: build/lib/libittnotify.a
2022-09-27T15:45:54.3272405Z inflating: build/lib/libtensorpipe_uv.a
2022-09-27T15:45:54.3805051Z inflating: build/lib/libprotobuf.a
2022-09-27T15:45:54.3882420Z inflating: build/lib/libasmjit.a
2022-09-27T15:45:54.4015157Z inflating: build/lib/libgloo.a
2022-09-27T15:45:54.4034898Z inflating: build/lib/libfmt.a
2022-09-27T15:45:54.4036850Z inflating: build/lib/libcaffe2_nvrtc.so
2022-09-27T15:45:54.4037391Z inflating: build/lib/libfoxi_loader.a
2022-09-27T15:45:54.4110183Z inflating: build/lib/libc10.so
2022-09-27T15:45:54.4111467Z inflating: build/lib/libtorch_global_deps.so
2022-09-27T15:45:54.4121628Z inflating: build/lib/libcpuinfo.a
2022-09-27T15:45:54.4130506Z inflating: build/lib/libcpuinfo_internals.a
2022-09-27T15:45:54.4146419Z inflating: build/lib/libqnnpack.a
2022-09-27T15:45:54.4170424Z inflating: build/lib/libpytorch_qnnpack.a
2022-09-27T15:45:54.4739388Z inflating: build/lib/libprotoc.a
2022-09-27T15:45:54.4742044Z inflating: build/lib/libnnpack_reference_layers.a
2022-09-27T15:45:54.4764574Z inflating: build/lib/libnnpack.a
2022-09-27T15:45:54.4783540Z inflating: build/lib/libgmock.a
2022-09-27T15:45:54.4784193Z inflating: build/lib/libgtest_main.a
2022-09-27T15:45:54.4785106Z inflating: build/lib/libbenchmark_main.a
2022-09-27T15:45:55.2889324Z inflating: build/lib/libdnnl.a
2022-09-27T15:45:55.3542855Z inflating: build/lib/libtensorpipe.a
2022-09-27T15:45:55.3684173Z inflating: build/lib/libXNNPACK.a
2022-09-27T15:45:55.3729197Z inflating: build/lib/libc10_cuda.so
2022-09-27T15:45:55.3729934Z inflating: build/lib/libgmock_main.a
2022-09-27T15:45:55.5254170Z inflating: build/lib/libfbgemm.a
2022-09-27T15:45:55.5543262Z inflating: build/lib/libtensorpipe_cuda.a
2022-09-27T15:45:55.6671357Z inflating: build/lib/libdnnl_graph.a
2022-09-27T15:45:55.7092537Z inflating: build/lib/libkineto.a
2022-09-27T15:45:55.7137892Z inflating: build/lib/libcaffe2_protos.a
2022-09-27T15:45:55.7185656Z inflating: build/lib/libonnx_proto.a
2022-09-27T15:45:55.7863049Z inflating: build/lib/libonnx.a
2022-09-27T15:45:55.8290978Z inflating: build/lib/libgloo_cuda.a
2022-09-27T15:45:58.1264387Z inflating: build/lib/libtorch_cpu.so
2022-09-27T15:45:58.4686991Z inflating: build/lib/libtorch_cuda_cpp.so
2022-09-27T15:46:00.1633198Z inflating: build/lib/libtorch_cuda_cu.so
2022-09-27T15:46:00.1633983Z inflating: build/lib/libtorch_cuda.so
2022-09-27T15:46:00.1635647Z inflating: build/lib/libtorch.so
2022-09-27T15:46:00.1639169Z inflating: build/lib/libc10d_cuda_test.so
2022-09-27T15:46:01.1509303Z inflating: build/lib/libtorch_cuda_linalg.so
2022-09-27T15:46:01.1533471Z inflating: build/lib/libjitbackend_test.so
2022-09-27T15:46:01.1593666Z inflating: build/lib/libtorchbind_test.so
2022-09-27T15:46:01.1624038Z inflating: build/lib/libbackend_with_compiler.so
2022-09-27T15:46:01.1628767Z inflating: build/lib/libshm.so
2022-09-27T15:46:01.3384419Z inflating: build/lib/libtorch_python.so
2022-09-27T15:46:01.3423691Z inflating: build/lib/libnnapi_backend.so
2022-09-27T15:46:01.3424219Z creating: build/bin/
2022-09-27T15:46:01.3476596Z inflating: build/bin/c10_CompileTimeFunctionPointer_test
2022-09-27T15:46:01.3531693Z inflating: build/bin/c10_DeviceGuard_test
2022-09-27T15:46:01.3585157Z inflating: build/bin/c10_Device_test
2022-09-27T15:46:01.3648334Z inflating: build/bin/c10_DispatchKeySet_test
2022-09-27T15:46:01.3699483Z inflating: build/bin/c10_StreamGuard_test
2022-09-27T15:46:01.3753193Z inflating: build/bin/c10_SymInt_test
2022-09-27T15:46:01.3812705Z inflating: build/bin/c10_InlineDeviceGuard_test
2022-09-27T15:46:01.3873343Z inflating: build/bin/c10_InlineStreamGuard_test
2022-09-27T15:46:01.3934059Z inflating: build/bin/c10_SizesAndStrides_test
2022-09-27T15:46:01.3985384Z inflating: build/bin/c10_Array_test
2022-09-27T15:46:01.4042108Z inflating: build/bin/c10_Bitset_test
2022-09-27T15:46:01.4097033Z inflating: build/bin/c10_C++17_test
2022-09-27T15:46:01.4148230Z inflating: build/bin/c10_ConstexprCrc_test
2022-09-27T15:46:01.4200758Z inflating: build/bin/c10_DeadlockDetection_test
2022-09-27T15:46:01.4253752Z inflating: build/bin/c10_Half_test
2022-09-27T15:46:01.4314319Z inflating: build/bin/c10_LeftRight_test
2022-09-27T15:46:01.4381219Z inflating: build/bin/c10_Metaprogramming_test
2022-09-27T15:46:01.4536957Z inflating: build/bin/c10_SmallVectorTest
2022-09-27T15:46:01.4590525Z inflating: build/bin/c10_Synchronized_test
2022-09-27T15:46:01.4651843Z inflating: build/bin/c10_ThreadLocal_test
2022-09-27T15:46:01.4707878Z inflating: build/bin/c10_TypeIndex_test
2022-09-27T15:46:01.4761942Z inflating: build/bin/c10_TypeList_test
2022-09-27T15:46:01.4813315Z inflating: build/bin/c10_TypeTraits_test
2022-09-27T15:46:01.4868235Z inflating: build/bin/c10_accumulate_test
2022-09-27T15:46:01.4928041Z inflating: build/bin/c10_bfloat16_test
2022-09-27T15:46:01.4985547Z inflating: build/bin/c10_complex_math_test
2022-09-27T15:46:01.5044971Z inflating: build/bin/c10_complex_test
2022-09-27T15:46:01.5162775Z inflating: build/bin/c10_either_test
2022-09-27T15:46:01.5218518Z inflating: build/bin/c10_exception_test
2022-09-27T15:46:01.5272033Z inflating: build/bin/c10_flags_test
2022-09-27T15:46:01.5454401Z inflating: build/bin/c10_intrusive_ptr_test
2022-09-27T15:46:01.5507963Z inflating: build/bin/c10_irange_test
2022-09-27T15:46:01.5569708Z inflating: build/bin/c10_logging_test
2022-09-27T15:46:01.5649574Z inflating: build/bin/c10_optional_test
2022-09-27T15:46:01.5715878Z inflating: build/bin/c10_ordered_preserving_dict_test
2022-09-27T15:46:01.5773963Z inflating: build/bin/c10_registry_test
2022-09-27T15:46:01.5838055Z inflating: build/bin/c10_string_view_test
2022-09-27T15:46:01.5898121Z inflating: build/bin/c10_intrusive_ptr_benchmark
2022-09-27T15:46:01.5953284Z inflating: build/bin/c10_tempfile_test
2022-09-27T15:46:01.6013610Z inflating: build/bin/c10_typeid_test
2022-09-27T15:46:01.6534932Z inflating: build/bin/protoc-3.13.0.0
2022-09-27T15:46:01.7056282Z inflating: build/bin/protoc
2022-09-27T15:46:01.7108040Z inflating: build/bin/c10_cuda_CUDATest
2022-09-27T15:46:01.7425389Z inflating: build/bin/vec_test_all_types_DEFAULT
2022-09-27T15:46:01.7779659Z inflating: build/bin/vec_test_all_types_AVX2
2022-09-27T15:46:01.7844456Z inflating: build/bin/TCPStoreTest
2022-09-27T15:46:01.7901706Z inflating: build/bin/FileStoreTest 2022-09-27T15:46:01.7959073Z inflating: build/bin/HashStoreTest 2022-09-27T15:46:01.7975075Z inflating: build/bin/ProcessGroupMPITest 2022-09-27T15:46:01.7978055Z inflating: build/bin/example_allreduce 2022-09-27T15:46:01.8034366Z inflating: build/bin/Dimname_test 2022-09-27T15:46:01.8112531Z inflating: build/bin/Dict_test 2022-09-27T15:46:01.8180603Z inflating: build/bin/MaybeOwned_test 2022-09-27T15:46:01.8241725Z inflating: build/bin/NamedTensor_test 2022-09-27T15:46:01.8305198Z inflating: build/bin/apply_utils_test 2022-09-27T15:46:01.8368234Z inflating: build/bin/atest 2022-09-27T15:46:01.8433006Z inflating: build/bin/basic 2022-09-27T15:46:01.8490354Z inflating: build/bin/broadcast_test 2022-09-27T15:46:01.8552987Z inflating: build/bin/cpu_generator_test 2022-09-27T15:46:01.8609232Z inflating: build/bin/cpu_profiling_allocator_test 2022-09-27T15:46:01.8662363Z inflating: build/bin/dispatch_key_set_test 2022-09-27T15:46:01.8757162Z inflating: build/bin/cpu_rng_test 2022-09-27T15:46:01.8810032Z inflating: build/bin/dlconvertor_test 2022-09-27T15:46:01.8871941Z inflating: build/bin/extension_backend_test 2022-09-27T15:46:01.8931670Z inflating: build/bin/half_test 2022-09-27T15:46:01.8983859Z inflating: build/bin/lazy_tensor_test 2022-09-27T15:46:01.9084605Z inflating: build/bin/ivalue_test 2022-09-27T15:46:01.9140360Z inflating: build/bin/memory_format_test 2022-09-27T15:46:01.9196713Z inflating: build/bin/math_kernel_test 2022-09-27T15:46:01.9252482Z inflating: build/bin/memory_overlapping_test 2022-09-27T15:46:01.9306616Z inflating: build/bin/operator_name_test 2022-09-27T15:46:01.9366578Z inflating: build/bin/native_test 2022-09-27T15:46:01.9422286Z inflating: build/bin/mobile_memory_cleanup 2022-09-27T15:46:01.9475722Z inflating: build/bin/operators_test 2022-09-27T15:46:01.9531796Z inflating: build/bin/packedtensoraccessor_test 2022-09-27T15:46:01.9601003Z inflating: build/bin/pow_test 2022-09-27T15:46:01.9662281Z inflating: build/bin/quantized_test 2022-09-27T15:46:01.9716717Z inflating: build/bin/reportMemoryUsage_test 2022-09-27T15:46:01.9769358Z inflating: build/bin/reduce_ops_test 2022-09-27T15:46:01.9829462Z inflating: build/bin/scalar_tensor_test 2022-09-27T15:46:01.9890721Z inflating: build/bin/scalar_test 2022-09-27T15:46:01.9945724Z inflating: build/bin/stride_properties_test 2022-09-27T15:46:02.0029304Z inflating: build/bin/tensor_iterator_test 2022-09-27T15:46:02.0088985Z inflating: build/bin/type_ptr_test 2022-09-27T15:46:02.0091611Z inflating: build/bin/thread_init_test 2022-09-27T15:46:02.0150874Z inflating: build/bin/test_parallel 2022-09-27T15:46:02.0206452Z inflating: build/bin/undefined_tensor_test 2022-09-27T15:46:02.0271299Z inflating: build/bin/type_test 2022-09-27T15:46:02.0323954Z inflating: build/bin/variant_test 2022-09-27T15:46:02.0325366Z inflating: build/bin/verify_api_visibility 2022-09-27T15:46:02.0399556Z inflating: build/bin/vmap_test 2022-09-27T15:46:02.0453679Z inflating: build/bin/weakref_test 2022-09-27T15:46:02.0507850Z inflating: build/bin/wrapdim_test 2022-09-27T15:46:02.0571997Z inflating: build/bin/IListRef_test 2022-09-27T15:46:02.0623941Z inflating: build/bin/xla_tensor_test 2022-09-27T15:46:02.0742044Z inflating: build/bin/List_test 2022-09-27T15:46:02.0873590Z inflating: build/bin/kernel_function_legacy_test 2022-09-27T15:46:02.0942332Z inflating: build/bin/KernelFunction_test 2022-09-27T15:46:02.1045842Z inflating: build/bin/kernel_function_test 
2022-09-27T15:46:02.1184422Z inflating: build/bin/kernel_lambda_legacy_test 2022-09-27T15:46:02.1296750Z inflating: build/bin/kernel_lambda_test 2022-09-27T15:46:02.1360903Z inflating: build/bin/kernel_stackbased_test 2022-09-27T15:46:02.1414606Z inflating: build/bin/CppSignature_test 2022-09-27T15:46:02.1517871Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2022-09-27T15:46:02.1568897Z inflating: build/bin/op_allowlist_test 2022-09-27T15:46:02.1625104Z inflating: build/bin/inline_container_test 2022-09-27T15:46:02.1938637Z inflating: build/bin/op_registration_test 2022-09-27T15:46:02.1999170Z inflating: build/bin/backend_fallback_test 2022-09-27T15:46:02.2054697Z inflating: build/bin/cuda_apply_test 2022-09-27T15:46:02.2120681Z inflating: build/bin/cuda_atomic_ops_test 2022-09-27T15:46:02.2177676Z inflating: build/bin/cuda_caching_host_allocator_test 2022-09-27T15:46:02.2230908Z inflating: build/bin/cuda_device_test 2022-09-27T15:46:02.2304023Z inflating: build/bin/cuda_complex_math_test 2022-09-27T15:46:02.2367012Z inflating: build/bin/cuda_complex_test 2022-09-27T15:46:02.2431063Z inflating: build/bin/cuda_cub_test 2022-09-27T15:46:02.2484324Z inflating: build/bin/cuda_dlconvertor_test 2022-09-27T15:46:02.2538786Z inflating: build/bin/cuda_integer_divider_test 2022-09-27T15:46:02.2610914Z inflating: build/bin/cuda_distributions_test 2022-09-27T15:46:02.2673881Z inflating: build/bin/cuda_generator_test 2022-09-27T15:46:02.2726865Z inflating: build/bin/cuda_half_test 2022-09-27T15:46:02.2778720Z inflating: build/bin/cuda_optional_test 2022-09-27T15:46:02.2844300Z inflating: build/bin/cuda_stream_test 2022-09-27T15:46:02.2900153Z inflating: build/bin/cuda_reportMemoryUsage_test 2022-09-27T15:46:02.2955548Z inflating: build/bin/cuda_packedtensoraccessor_test 2022-09-27T15:46:02.3007376Z inflating: build/bin/cuda_cudnn_test 2022-09-27T15:46:02.3063781Z inflating: build/bin/cuda_vectorized_test 2022-09-27T15:46:02.3081049Z inflating: build/bin/tutorial_tensorexpr 2022-09-27T15:46:02.3150465Z inflating: build/bin/ProcessGroupGlooTest 2022-09-27T15:46:02.3212676Z inflating: build/bin/ProcessGroupGlooAsyncTest 2022-09-27T15:46:02.3279385Z inflating: build/bin/ProcessGroupNCCLTest 2022-09-27T15:46:02.3342334Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2022-09-27T15:46:02.3399251Z inflating: build/bin/ProcessGroupUCCTest 2022-09-27T15:46:02.3456741Z inflating: build/bin/test_dist_autograd 2022-09-27T15:46:02.3531400Z inflating: build/bin/test_cpp_rpc 2022-09-27T15:46:02.3534313Z inflating: build/bin/parallel_benchmark 2022-09-27T15:46:02.3608001Z inflating: build/bin/test_mobile_nnc 2022-09-27T15:46:02.3619455Z inflating: build/bin/aot_model_compiler_test 2022-09-27T15:46:02.4539022Z inflating: build/bin/test_tensorexpr 2022-09-27T15:46:02.4917821Z inflating: build/bin/test_lazy 2022-09-27T15:46:02.4923413Z inflating: build/bin/torch_shm_manager 2022-09-27T15:46:02.6228448Z inflating: build/bin/test_api 2022-09-27T15:46:02.7364963Z inflating: build/bin/test_jit 2022-09-27T15:46:02.7366739Z inflating: .pytorch-test-times.json 2022-09-27T15:46:02.7396298Z ##[group]Run df -H 2022-09-27T15:46:02.7396554Z df -H 2022-09-27T15:46:02.7409675Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T15:46:02.7409976Z env: 2022-09-27T15:46:02.7410205Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:46:02.7410477Z GPU_FLAG: --gpus all 2022-09-27T15:46:02.7410725Z ##[endgroup] 2022-09-27T15:46:02.7449061Z Filesystem Size Used Avail Use% Mounted on 2022-09-27T15:46:02.7449388Z devtmpfs 
129G 0 129G 0% /dev 2022-09-27T15:46:02.7449665Z tmpfs 129G 0 129G 0% /dev/shm 2022-09-27T15:46:02.7449943Z tmpfs 129G 549k 129G 1% /run 2022-09-27T15:46:02.7450231Z tmpfs 129G 0 129G 0% /sys/fs/cgroup 2022-09-27T15:46:02.7450502Z /dev/xvda1 162G 30G 132G 19% / 2022-09-27T15:46:02.7493542Z ##[group]Run .github/scripts/parse_ref.py 2022-09-27T15:46:02.7493909Z .github/scripts/parse_ref.py 2022-09-27T15:46:02.7506096Z shell: /usr/bin/bash -e {0} 2022-09-27T15:46:02.7506351Z env: 2022-09-27T15:46:02.7506594Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:46:02.7506848Z GPU_FLAG: --gpus all 2022-09-27T15:46:02.7507101Z ##[endgroup] 2022-09-27T15:46:02.7809228Z ##[group]Run set -x 2022-09-27T15:46:02.7809621Z set -x 2022-09-27T15:46:02.7809859Z  2022-09-27T15:46:02.7810136Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2022-09-27T15:46:02.7810471Z  TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh 2022-09-27T15:46:02.7810829Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2022-09-27T15:46:02.7811157Z  TEST_COMMAND=.jenkins/caffe2/test.sh 2022-09-27T15:46:02.7811419Z else 2022-09-27T15:46:02.7811703Z  TEST_COMMAND=.jenkins/pytorch/test.sh 2022-09-27T15:46:02.7811981Z fi 2022-09-27T15:46:02.7812186Z  2022-09-27T15:46:02.7812510Z COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}") 2022-09-27T15:46:02.7812966Z  2022-09-27T15:46:02.7813265Z # sanitize the input commit message and PR body here: 2022-09-27T15:46:02.7813545Z # 2022-09-27T15:46:02.7813933Z # trim all new lines from commit messages + PR_BODY to avoid issues with batch environment 2022-09-27T15:46:02.7814438Z # variable copying. see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028 2022-09-27T15:46:02.7814846Z COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}" 2022-09-27T15:46:02.7815165Z PR_BODY="${PR_BODY//[$'\n\r']}" 2022-09-27T15:46:02.7815425Z  2022-09-27T15:46:02.7815761Z # then trim all special characters like single and double quotes to avoid unescaped inputs to 2022-09-27T15:46:02.7816134Z # wreak havoc internally 2022-09-27T15:46:02.7816461Z export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}" 2022-09-27T15:46:02.7816797Z export PR_BODY="${PR_BODY//[\'\"]}" 2022-09-27T15:46:02.7817049Z  2022-09-27T15:46:02.7817361Z # detached container should get cleaned up by teardown_ec2_linux 2022-09-27T15:46:02.7817765Z # TODO: Stop building test binaries as part of the build phase 2022-09-27T15:46:02.7818118Z # Used for GPU_FLAG since that doesn't play nice 2022-09-27T15:46:02.7818448Z # shellcheck disable=SC2086,SC2090 2022-09-27T15:46:02.7818752Z container_name=$(docker run \ 2022-09-27T15:46:02.7819014Z  ${GPU_FLAG:-} \ 2022-09-27T15:46:02.7819286Z  -e BUILD_ENVIRONMENT \ 2022-09-27T15:46:02.7819562Z  -e PR_NUMBER \ 2022-09-27T15:46:02.7819832Z  -e GITHUB_ACTIONS \ 2022-09-27T15:46:02.7820077Z  -e BASE_SHA \ 2022-09-27T15:46:02.7820329Z  -e BRANCH \ 2022-09-27T15:46:02.7820574Z  -e SHA1 \ 2022-09-27T15:46:02.7820819Z  -e AWS_DEFAULT_REGION \ 2022-09-27T15:46:02.7821094Z  -e IN_WHEEL_TEST \ 2022-09-27T15:46:02.7821364Z  -e SHARD_NUMBER \ 2022-09-27T15:46:02.7821609Z  -e TEST_CONFIG \ 2022-09-27T15:46:02.7821878Z  -e NUM_TEST_SHARDS \ 2022-09-27T15:46:02.7822140Z  -e PR_BODY \ 2022-09-27T15:46:02.7822391Z  -e COMMIT_MESSAGES \ 2022-09-27T15:46:02.7822683Z  -e PYTORCH_RETRY_TEST_CASES \ 2022-09-27T15:46:02.7823000Z  -e PYTORCH_OVERRIDE_FLAKY_SIGNAL \ 2022-09-27T15:46:02.7823270Z  -e PR_LABELS \ 2022-09-27T15:46:02.7823561Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2022-09-27T15:46:02.7823853Z  -e SCCACHE_BUCKET \ 
2022-09-27T15:46:02.7824115Z  -e SCCACHE_S3_KEY_PREFIX \ 2022-09-27T15:46:02.7824389Z  -e XLA_CUDA \ 2022-09-27T15:46:02.7824674Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2022-09-27T15:46:02.7825014Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2022-09-27T15:46:02.7825323Z  --ulimit stack=10485760:83886080 \ 2022-09-27T15:46:02.7825640Z  --security-opt seccomp=unconfined \ 2022-09-27T15:46:02.7825949Z  --cap-add=SYS_PTRACE \ 2022-09-27T15:46:02.7826200Z  --ipc=host \ 2022-09-27T15:46:02.7826539Z  --shm-size="${SHM_SIZE}" \ 2022-09-27T15:46:02.7826814Z  --tty \ 2022-09-27T15:46:02.7827039Z  --detach \ 2022-09-27T15:46:02.7827315Z  --name="${container_name}" \ 2022-09-27T15:46:02.7827593Z  --user jenkins \ 2022-09-27T15:46:02.7827897Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2022-09-27T15:46:02.7828243Z  -w /var/lib/jenkins/workspace \ 2022-09-27T15:46:02.7828529Z  "${DOCKER_IMAGE}" 2022-09-27T15:46:02.7828774Z ) 2022-09-27T15:46:02.7829094Z docker exec -t "${container_name}" sh -c "pip install dist/*.whl && ${TEST_COMMAND}" 2022-09-27T15:46:02.7840906Z shell: /usr/bin/bash -e {0} 2022-09-27T15:46:02.7841160Z env: 2022-09-27T15:46:02.7841383Z GIT_DEFAULT_BRANCH: master 2022-09-27T15:46:02.7841748Z GPU_FLAG: --gpus all 2022-09-27T15:46:02.7842082Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.10-gcc7 2022-09-27T15:46:02.7842382Z PR_NUMBER: 85462 2022-09-27T15:46:02.7842636Z BRANCH: pull/85462 2022-09-27T15:46:02.7842931Z SHA1: 52424e2bf38e454d535881fed9628d3e20f4f944 2022-09-27T15:46:02.7843236Z BASE_SHA: 76d60778eb01b4213735be1c6e126fe2da519b8e 2022-09-27T15:46:02.7843540Z PYTORCH_RETRY_TEST_CASES: 1 2022-09-27T15:46:02.7843826Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-09-27T15:46:02.7844090Z TEST_CONFIG: distributed 2022-09-27T15:46:02.7844343Z SHARD_NUMBER: 2 2022-09-27T15:46:02.7844588Z NUM_TEST_SHARDS: 3 2022-09-27T15:46:02.7846799Z PR_BODY: Hi, we noticed in our team that by using CyclicLR, there is a problem with memory clearance on GPU (probably it will be the case without the GPU as well, but that was our use case) After initializing CyclicLR, GPU memory is not cleared even after the model, optimizer and scheduler are out of scope (e.g. reference count is zero). This is because `__init__` method inside `CyclicLR` creates reference to its own methods and it will not get removed until `gc.collect()` is called manually. This is a problem if people want to test multiple models in one run of a script, after testing the first model, second one will fail on `CUDA out of memory error` because the first one is not cleared from the memory. I propose a simple fix by using `weakref`, similarly as in `_LRScheduler` base class, but if you have any comments I am happy to change it. Here is the code to reproduce the bug: ``` import torch import weakref from transformers import DetrForObjectDetection class X: def __init__(self, optimizer): self.optimizer = optimizer # Will cause cyclic reference. self.func = self.dummy # Will work as expected, memory cleared after instance count is zero. # self.func = weakref.WeakMethod(self.dummy) def dummy(self, x): return 1. def test(): model = DetrForObjectDetection.from_pretrained('facebook/detr-resnet-50') model.to('cuda') optimizer = torch.optim.Adam(model.parameters()) x = X(optimizer) test() print(f'{torch.cuda.memory_reserved()}, {torch.cuda.memory_allocated()}') # Should print (, 0), but with cyclic reference, it will print (, ). 
``` 2022-09-27T15:46:02.7849170Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2022-09-27T15:46:02.7849504Z SCCACHE_S3_KEY_PREFIX: pull 2022-09-27T15:46:02.7849764Z SHM_SIZE: 2g 2022-09-27T15:46:02.7850236Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T15:46:02.7850716Z XLA_CUDA: 2022-09-27T15:46:02.7851073Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2022-09-27T15:46:02.7851405Z ##[endgroup] 2022-09-27T15:46:02.7880149Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2022-09-27T15:46:02.7880618Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *onnx* ]] 2022-09-27T15:46:02.7880956Z + TEST_COMMAND=.jenkins/pytorch/test.sh 2022-09-27T15:46:02.7884303Z ++ git cherry -v origin/master 2022-09-27T15:46:02.8127497Z + COMMIT_MESSAGES='+ 871567eae42e57e9926f1a38c0b8d221f672c928 Fix mem leak because of self reference in CyclicLR 2022-09-27T15:46:02.8127968Z + acca92f8843f34854e73ee280c4f87ea280d2914 Typing for new CyclicLR 2022-09-27T15:46:02.8128348Z + 788218632109bd2065d4961cae624c62f106deee Rename scale_fn_* to private form 2022-09-27T15:46:02.8128743Z + 1642fe39e72ecf43de500a53610a076827e792b9 Test CyclicLR cyclic reference 2022-09-27T15:46:02.8129109Z + 52424e2bf38e454d535881fed9628d3e20f4f944 Fix linting' 2022-09-27T15:46:02.8130851Z + COMMIT_MESSAGES='+ 871567eae42e57e9926f1a38c0b8d221f672c928 Fix mem leak because of self reference in CyclicLR+ acca92f8843f34854e73ee280c4f87ea280d2914 Typing for new CyclicLR+ 788218632109bd2065d4961cae624c62f106deee Rename scale_fn_* to private form+ 1642fe39e72ecf43de500a53610a076827e792b9 Test CyclicLR cyclic reference+ 52424e2bf38e454d535881fed9628d3e20f4f944 Fix linting' 2022-09-27T15:46:02.8147131Z + PR_BODY='Hi, we noticed in our team that by using CyclicLR, there is a problem with memory clearance on GPU (probably it will be the case without the GPU as well, but that was our use case) After initializing CyclicLR, GPU memory is not cleared even after the model, optimizer and scheduler are out of scope (e.g. reference count is zero). This is because `__init__` method inside `CyclicLR` creates reference to its own methods and it will not get removed until `gc.collect()` is called manually. This is a problem if people want to test multiple models in one run of a script, after testing the first model, second one will fail on `CUDA out of memory error` because the first one is not cleared from the memory.I propose a simple fix by using `weakref`, similarly as in `_LRScheduler` base class, but if you have any comments I am happy to change it. Here is the code to reproduce the bug:```import torchimport weakreffrom transformers import DetrForObjectDetectionclass X: def __init__(self, optimizer): self.optimizer = optimizer # Will cause cyclic reference. self.func = self.dummy # Will work as expected, memory cleared after instance count is zero. 
# self.func = weakref.WeakMethod(self.dummy) def dummy(self, x): return 1.def test(): model = DetrForObjectDetection.from_pretrained('\''facebook/detr-resnet-50'\'') model.to('\''cuda'\'') optimizer = torch.optim.Adam(model.parameters()) x = X(optimizer)test()print(f'\''{torch.cuda.memory_reserved()}, {torch.cuda.memory_allocated()}'\'') # Should print (, 0), but with cyclic reference, it will print (, ).```' 2022-09-27T15:46:02.8150237Z + export 'COMMIT_MESSAGES=+ 871567eae42e57e9926f1a38c0b8d221f672c928 Fix mem leak because of self reference in CyclicLR+ acca92f8843f34854e73ee280c4f87ea280d2914 Typing for new CyclicLR+ 788218632109bd2065d4961cae624c62f106deee Rename scale_fn_* to private form+ 1642fe39e72ecf43de500a53610a076827e792b9 Test CyclicLR cyclic reference+ 52424e2bf38e454d535881fed9628d3e20f4f944 Fix linting' 2022-09-27T15:46:02.8151908Z + COMMIT_MESSAGES='+ 871567eae42e57e9926f1a38c0b8d221f672c928 Fix mem leak because of self reference in CyclicLR+ acca92f8843f34854e73ee280c4f87ea280d2914 Typing for new CyclicLR+ 788218632109bd2065d4961cae624c62f106deee Rename scale_fn_* to private form+ 1642fe39e72ecf43de500a53610a076827e792b9 Test CyclicLR cyclic reference+ 52424e2bf38e454d535881fed9628d3e20f4f944 Fix linting' 2022-09-27T15:46:02.8159582Z + export 'PR_BODY=Hi, we noticed in our team that by using CyclicLR, there is a problem with memory clearance on GPU (probably it will be the case without the GPU as well, but that was our use case) After initializing CyclicLR, GPU memory is not cleared even after the model, optimizer and scheduler are out of scope (e.g. reference count is zero). This is because `__init__` method inside `CyclicLR` creates reference to its own methods and it will not get removed until `gc.collect()` is called manually. This is a problem if people want to test multiple models in one run of a script, after testing the first model, second one will fail on `CUDA out of memory error` because the first one is not cleared from the memory.I propose a simple fix by using `weakref`, similarly as in `_LRScheduler` base class, but if you have any comments I am happy to change it. Here is the code to reproduce the bug:```import torchimport weakreffrom transformers import DetrForObjectDetectionclass X: def __init__(self, optimizer): self.optimizer = optimizer # Will cause cyclic reference. self.func = self.dummy # Will work as expected, memory cleared after instance count is zero. # self.func = weakref.WeakMethod(self.dummy) def dummy(self, x): return 1.def test(): model = DetrForObjectDetection.from_pretrained(facebook/detr-resnet-50) model.to(cuda) optimizer = torch.optim.Adam(model.parameters()) x = X(optimizer)test()print(f{torch.cuda.memory_reserved()}, {torch.cuda.memory_allocated()}) # Should print (, 0), but with cyclic reference, it will print (, ).```' 2022-09-27T15:46:02.8165020Z + PR_BODY='Hi, we noticed in our team that by using CyclicLR, there is a problem with memory clearance on GPU (probably it will be the case without the GPU as well, but that was our use case) After initializing CyclicLR, GPU memory is not cleared even after the model, optimizer and scheduler are out of scope (e.g. reference count is zero). This is because `__init__` method inside `CyclicLR` creates reference to its own methods and it will not get removed until `gc.collect()` is called manually. 
This is a problem if people want to test multiple models in one run of a script, after testing the first model, second one will fail on `CUDA out of memory error` because the first one is not cleared from the memory.I propose a simple fix by using `weakref`, similarly as in `_LRScheduler` base class, but if you have any comments I am happy to change it. Here is the code to reproduce the bug:```import torchimport weakreffrom transformers import DetrForObjectDetectionclass X: def __init__(self, optimizer): self.optimizer = optimizer # Will cause cyclic reference. self.func = self.dummy # Will work as expected, memory cleared after instance count is zero. # self.func = weakref.WeakMethod(self.dummy) def dummy(self, x): return 1.def test(): model = DetrForObjectDetection.from_pretrained(facebook/detr-resnet-50) model.to(cuda) optimizer = torch.optim.Adam(model.parameters()) x = X(optimizer)test()print(f{torch.cuda.memory_reserved()}, {torch.cuda.memory_allocated()}) # Should print (, 0), but with cyclic reference, it will print (, ).```' 2022-09-27T15:46:02.8167316Z +++ nproc --ignore=2 2022-09-27T15:46:02.8189922Z ++ docker run --gpus all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e PR_BODY -e COMMIT_MESSAGES -e PYTORCH_RETRY_TEST_CASES -e PYTORCH_OVERRIDE_FLAKY_SIGNAL -e PR_LABELS -e MAX_JOBS=30 -e SCCACHE_BUCKET -e SCCACHE_S3_KEY_PREFIX -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME --env-file=/tmp/github_env_3133193930 --ulimit stack=10485760:83886080 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T15:46:16.4492120Z + container_name=ac37d1fee4fc78027b66bf2dbe82cdb150df17552519faa479ea1f0aad2016f1 2022-09-27T15:46:16.4493056Z + docker exec -t ac37d1fee4fc78027b66bf2dbe82cdb150df17552519faa479ea1f0aad2016f1 sh -c 'pip install dist/*.whl && .jenkins/pytorch/test.sh' 2022-09-27T15:46:17.0311006Z Processing ./dist/torch-1.13.0a0+git52424e2-cp310-cp310-linux_x86_64.whl 2022-09-27T15:46:17.1218456Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==1.13.0a0+git52424e2) (4.3.0) 2022-09-27T15:46:18.0563810Z Installing collected packages: torch 2022-09-27T15:46:28.5079258Z Successfully installed torch-1.13.0a0+git52424e2 2022-09-27T15:46:28.6723917Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2022-09-27T15:46:28.6945652Z + TORCH_INSTALL_DIR=/opt/conda/lib/python3.10/site-packages/torch 2022-09-27T15:46:28.6947836Z + TORCH_BIN_DIR=/opt/conda/lib/python3.10/site-packages/torch/bin 2022-09-27T15:46:28.6948347Z + TORCH_LIB_DIR=/opt/conda/lib/python3.10/site-packages/torch/lib 2022-09-27T15:46:28.6948825Z + TORCH_TEST_DIR=/opt/conda/lib/python3.10/site-packages/torch/test 2022-09-27T15:46:28.6949128Z + BUILD_DIR=build 2022-09-27T15:46:28.6949399Z + BUILD_RENAMED_DIR=build_renamed 2022-09-27T15:46:28.6952347Z + BUILD_BIN_DIR=build/bin 2022-09-27T15:46:28.6952936Z + export VALGRIND=ON 2022-09-27T15:46:28.6953196Z + VALGRIND=ON 2022-09-27T15:46:28.6953636Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *clang9* ]] 2022-09-27T15:46:28.6954084Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != 
*bazel* ]] 2022-09-27T15:46:28.6954426Z ++ realpath build/custom_test_artifacts 2022-09-27T15:46:28.6961621Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2022-09-27T15:46:28.6964924Z ++ dirname .jenkins/pytorch/test.sh 2022-09-27T15:46:28.6972915Z + source .jenkins/pytorch/common.sh 2022-09-27T15:46:28.6977307Z +++ dirname .jenkins/pytorch/common.sh 2022-09-27T15:46:28.6988108Z ++ source .jenkins/pytorch/common_utils.sh 2022-09-27T15:46:28.6990295Z +++ declare -f -t trap_add 2022-09-27T15:46:28.6995070Z ++ set -ex 2022-09-27T15:46:28.6995891Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-09-27T15:46:28.6996481Z ++ BUILD_TEST_LIBTORCH=0 2022-09-27T15:46:28.6998595Z ++ [[ distributed == *xla* ]] 2022-09-27T15:46:28.6999492Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *centos* ]] 2022-09-27T15:46:28.7000141Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *linux-bionic* ]] 2022-09-27T15:46:28.7000467Z ++ which conda 2022-09-27T15:46:28.7006970Z /opt/conda/bin/conda 2022-09-27T15:46:28.7008073Z ++ conda install -q -y cmake 2022-09-27T15:46:31.3657663Z Collecting package metadata (current_repodata.json): ...working... done 2022-09-27T15:46:32.0878429Z Solving environment: ...working... done 2022-09-27T15:46:32.1905960Z 2022-09-27T15:46:32.1906494Z ## Package Plan ## 2022-09-27T15:46:32.1906854Z 2022-09-27T15:46:32.1907121Z environment location: /opt/conda 2022-09-27T15:46:32.1907493Z 2022-09-27T15:46:32.1907736Z added / updated specs: 2022-09-27T15:46:32.1908513Z - cmake 2022-09-27T15:46:32.1908769Z 2022-09-27T15:46:32.1908797Z 2022-09-27T15:46:32.1908985Z The following packages will be downloaded: 2022-09-27T15:46:32.1909188Z 2022-09-27T15:46:32.1909323Z package | build 2022-09-27T15:46:32.1911937Z ---------------------------|----------------- 2022-09-27T15:46:32.1912574Z c-ares-1.18.1 | h7f8727e_0 114 KB 2022-09-27T15:46:32.1913024Z certifi-2022.9.14 | py310h06a4308_0 155 KB 2022-09-27T15:46:32.1913418Z cmake-3.22.1 | h1fce559_0 7.3 MB 2022-09-27T15:46:32.1913796Z conda-22.9.0 | py310h06a4308_0 894 KB 2022-09-27T15:46:32.1914178Z expat-2.4.4 | h295c915_0 169 KB 2022-09-27T15:46:32.1914557Z krb5-1.19.2 | hac12032_0 1.2 MB 2022-09-27T15:46:32.1914913Z libcurl-7.82.0 | h0b77cf5_0 342 KB 2022-09-27T15:46:32.1915308Z libedit-3.1.20210910 | h7f8727e_0 166 KB 2022-09-27T15:46:32.1915688Z libev-4.33 | h7f8727e_1 111 KB 2022-09-27T15:46:32.1916126Z libnghttp2-1.46.0 | hce63b2e_0 680 KB 2022-09-27T15:46:32.1916732Z libssh2-1.10.0 | h8f2d780_0 274 KB 2022-09-27T15:46:32.1917131Z libuv-1.40.0 | h7b6447c_0 736 KB 2022-09-27T15:46:32.1917727Z lz4-c-1.9.3 | h295c915_1 185 KB 2022-09-27T15:46:32.1918112Z rhash-1.4.1 | h3c74f83_1 203 KB 2022-09-27T15:46:32.1918492Z zstd-1.5.2 | ha4553b6_0 488 KB 2022-09-27T15:46:32.1918889Z ------------------------------------------------------------ 2022-09-27T15:46:32.1919223Z Total: 12.9 MB 2022-09-27T15:46:32.1919384Z 2022-09-27T15:46:32.1919546Z The following NEW packages will be INSTALLED: 2022-09-27T15:46:32.1919750Z 2022-09-27T15:46:32.1920103Z c-ares pkgs/main/linux-64::c-ares-1.18.1-h7f8727e_0 2022-09-27T15:46:32.1920592Z cmake pkgs/main/linux-64::cmake-3.22.1-h1fce559_0 2022-09-27T15:46:32.1921164Z expat pkgs/main/linux-64::expat-2.4.4-h295c915_0 2022-09-27T15:46:32.1921609Z krb5 pkgs/main/linux-64::krb5-1.19.2-hac12032_0 2022-09-27T15:46:32.1922079Z libcurl pkgs/main/linux-64::libcurl-7.82.0-h0b77cf5_0 2022-09-27T15:46:32.1922578Z libedit pkgs/main/linux-64::libedit-3.1.20210910-h7f8727e_0 
2022-09-27T15:46:32.1923036Z libev pkgs/main/linux-64::libev-4.33-h7f8727e_1 2022-09-27T15:46:32.1923526Z libnghttp2 pkgs/main/linux-64::libnghttp2-1.46.0-hce63b2e_0 2022-09-27T15:46:32.1924018Z libssh2 pkgs/main/linux-64::libssh2-1.10.0-h8f2d780_0 2022-09-27T15:46:32.1924489Z libuv pkgs/main/linux-64::libuv-1.40.0-h7b6447c_0 2022-09-27T15:46:32.1924947Z lz4-c pkgs/main/linux-64::lz4-c-1.9.3-h295c915_1 2022-09-27T15:46:32.1925401Z rhash pkgs/main/linux-64::rhash-1.4.1-h3c74f83_1 2022-09-27T15:46:32.1925860Z zstd pkgs/main/linux-64::zstd-1.5.2-ha4553b6_0 2022-09-27T15:46:32.1926068Z 2022-09-27T15:46:32.1926214Z The following packages will be UPDATED: 2022-09-27T15:46:32.1926388Z 2022-09-27T15:46:32.1926671Z certifi 2022.6.15-py310h06a4308_0 --> 2022.9.14-py310h06a4308_0 2022-09-27T15:46:32.1927131Z conda 4.14.0-py310h06a4308_0 --> 22.9.0-py310h06a4308_0 2022-09-27T15:46:32.1927340Z 2022-09-27T15:46:32.1927359Z 2022-09-27T15:46:33.8160871Z Preparing transaction: ...working... done 2022-09-27T15:46:34.3214448Z Verifying transaction: ...working... done 2022-09-27T15:46:35.6203682Z Executing transaction: ...working... done 2022-09-27T15:46:35.7504899Z Retrieving notices: ...working... done 2022-09-27T15:46:35.9429446Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *centos* ]] 2022-09-27T15:46:35.9429875Z + echo 'Environment variables' 2022-09-27T15:46:35.9430144Z Environment variables 2022-09-27T15:46:35.9430388Z + env 2022-09-27T15:46:35.9439125Z SHARD_NUMBER=2 2022-09-27T15:46:35.9439912Z NV_LIBCUBLAS_DEV_VERSION=11.9.2.110-1 2022-09-27T15:46:35.9440328Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-11-6 2022-09-27T15:46:35.9440701Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2022-09-27T15:46:35.9441145Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.12.10-1+cuda11.6 2022-09-27T15:46:35.9441775Z UCC_HOME=/usr 2022-09-27T15:46:35.9442486Z BUILD_ENVIRONMENT=linux-bionic-cuda11.6-py3.10-gcc7 2022-09-27T15:46:35.9443427Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-11-6=11.6.3.124-1 2022-09-27T15:46:35.9443986Z INSTALLED_DB=yes 2022-09-27T15:46:35.9444330Z HOSTNAME=ac37d1fee4fc 2022-09-27T15:46:35.9444596Z GITHUB_REF_NAME=85462/merge 2022-09-27T15:46:35.9444910Z GITHUB_API_URL=https://api.github.com 2022-09-27T15:46:35.9445191Z OPENSSL_DIR=/opt/openssl 2022-09-27T15:46:35.9445498Z UCC_COMMIT=12944da33f911daf505d9bbc51411233d0ed85e1 2022-09-27T15:46:35.9446099Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_1a931f01-bb6b-4a63-90c0-f50a70553762 2022-09-27T15:46:35.9446506Z CUDA_PATH=/usr/local/cuda 2022-09-27T15:46:35.9446993Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2022-09-27T15:46:35.9447367Z GITHUB_RUN_ATTEMPT=2 2022-09-27T15:46:35.9447842Z TEST_CONFIG=distributed 2022-09-27T15:46:35.9448359Z NV_LIBNPP_VERSION=11.6.3.124-1 2022-09-27T15:46:35.9448742Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-11-6=11.6.124-1 2022-09-27T15:46:35.9449042Z GITHUB_REPOSITORY_OWNER=pytorch 2022-09-27T15:46:35.9449318Z GITHUB_ACTIONS=true 2022-09-27T15:46:35.9449580Z NVIDIA_VISIBLE_DEVICES=all 2022-09-27T15:46:35.9449861Z NV_NVPROF_VERSION=11.6.124-1 2022-09-27T15:46:35.9450172Z NV_LIBCUSPARSE_VERSION=11.7.2.124-1 2022-09-27T15:46:35.9450432Z CI=true 2022-09-27T15:46:35.9450667Z PYTORCH_OVERRIDE_FLAKY_SIGNAL=1 2022-09-27T15:46:35.9451063Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-11-6=11.9.2.110-1 2022-09-27T15:46:35.9451368Z BRANCH=pull/85462 2022-09-27T15:46:35.9451693Z GITHUB_HEAD_REF=cycliclr-memory-fix 2022-09-27T15:46:35.9452015Z 
UCX_COMMIT=31e74cac7bee0ef66bef2af72e7d86d9c282e5ab 2022-09-27T15:46:35.9452448Z GITHUB_ACTOR=kongzii 2022-09-27T15:46:35.9452752Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2022-09-27T15:46:35.9453028Z GITHUB_ACTION_REF= 2022-09-27T15:46:35.9453314Z NCCL_VERSION=2.12.10-1 2022-09-27T15:46:35.9453569Z GITHUB_ACTION=__self 2022-09-27T15:46:35.9453796Z VALGRIND=ON 2022-09-27T15:46:35.9454069Z GITHUB_REF_PROTECTED=false 2022-09-27T15:46:35.9454516Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2022-09-27T15:46:35.9456909Z *** 2022-09-27T15:46:35.9457177Z INSTALLED_VISION=yes 2022-09-27T15:46:35.9457429Z NVARCH=x86_64 2022-09-27T15:46:35.9457734Z NV_LIBCUSPARSE_DEV_VERSION=11.7.2.124-1 2022-09-27T15:46:35.9458021Z HOME=/var/lib/jenkins 2022-09-27T15:46:35.9458292Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2022-09-27T15:46:35.9458558Z GITHUB_ACTION_REPOSITORY= 2022-09-27T15:46:35.9458827Z GITHUB_REF_TYPE=branch 2022-09-27T15:46:35.9459137Z NV_LIBNCCL_PACKAGE_VERSION=2.12.10-1 2022-09-27T15:46:35.9459411Z GITHUB_RETENTION_DAYS=90 2022-09-27T15:46:35.9459793Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2022-09-27T15:46:35.9460202Z NV_LIBNCCL_PACKAGE=libnccl2=2.12.10-1+cuda11.6 2022-09-27T15:46:35.9460755Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_1a931f01-bb6b-4a63-90c0-f50a70553762 2022-09-27T15:46:35.9461146Z DEBIAN_FRONTEND=noninteractive 2022-09-27T15:46:35.9461497Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2022-09-27T15:46:35.9461845Z GITHUB_REF=refs/pull/85462/merge 2022-09-27T15:46:35.9462148Z NV_CUDA_LIB_VERSION=11.6.2-1 2022-09-27T15:46:35.9462447Z GITHUB_SHA=1faa2af6dbb8dd899ab20874e9966185467c5883 2022-09-27T15:46:35.9462756Z INSTALLED_PROTOBUF=yes 2022-09-27T15:46:35.9463017Z GITHUB_RUN_ID=3133193930 2022-09-27T15:46:35.9463345Z NV_LIBNPP_PACKAGE=libnpp-11-6=11.6.3.124-1 2022-09-27T15:46:35.9463647Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2022-09-27T15:46:35.9463947Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2022-09-27T15:46:35.9464250Z NV_NVTX_VERSION=11.6.124-1 2022-09-27T15:46:35.9464581Z GITHUB_SERVER_URL=https://github.com 2022-09-27T15:46:35.9464872Z MAX_JOBS=30 2022-09-27T15:46:35.9465154Z NV_LIBCUBLAS_VERSION=11.9.2.110-1 2022-09-27T15:46:35.9465556Z NV_LIBCUBLAS_PACKAGE=libcublas-11-6=11.9.2.110-1 2022-09-27T15:46:35.9466071Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2022-09-27T15:46:35.9466420Z UCX_HOME=/usr 2022-09-27T15:46:35.9466687Z PYTORCH_RETRY_TEST_CASES=1 2022-09-27T15:46:35.9467030Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2022-09-27T15:46:35.9467392Z BASE_SHA=76d60778eb01b4213735be1c6e126fe2da519b8e 2022-09-27T15:46:35.9467731Z NV_CUDA_CUDART_DEV_VERSION=11.6.55-1 2022-09-27T15:46:35.9471746Z PR_BODY=Hi, we noticed in our team that by using CyclicLR, there is a problem with memory clearance on GPU (probably it will be the case without the GPU as well, but that was our use case) After initializing CyclicLR, GPU memory is not cleared even after the model, optimizer and scheduler are out of scope (e.g. reference count is zero). This is because `__init__` method inside `CyclicLR` creates reference to its own methods and it will not get removed until `gc.collect()` is called manually. 
This is a problem if people want to test multiple models in one run of a script, after testing the first model, second one will fail on `CUDA out of memory error` because the first one is not cleared from the memory.I propose a simple fix by using `weakref`, similarly as in `_LRScheduler` base class, but if you have any comments I am happy to change it. Here is the code to reproduce the bug:```import torchimport weakreffrom transformers import DetrForObjectDetectionclass X: def __init__(self, optimizer): self.optimizer = optimizer # Will cause cyclic reference. self.func = self.dummy # Will work as expected, memory cleared after instance count is zero. # self.func = weakref.WeakMethod(self.dummy) def dummy(self, x): return 1.def test(): model = DetrForObjectDetection.from_pretrained(facebook/detr-resnet-50) model.to(cuda) optimizer = torch.optim.Adam(model.parameters()) x = X(optimizer)test()print(f{torch.cuda.memory_reserved()}, {torch.cuda.memory_allocated()}) # Should print (, 0), but with cyclic reference, it will print (, ).``` 2022-09-27T15:46:35.9474415Z GITHUB_BASE_REF=master 2022-09-27T15:46:35.9474655Z TERM=xterm 2022-09-27T15:46:35.9474888Z XLA_CUDA= 2022-09-27T15:46:35.9475187Z NV_NVML_DEV_VERSION=11.6.55-1 2022-09-27T15:46:35.9475464Z TORCH_CUDA_ARCH_LIST=Maxwell 2022-09-27T15:46:35.9475739Z CUDA_VERSION=11.6.2 2022-09-27T15:46:35.9476100Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-11-6 2022-09-27T15:46:35.9476401Z OPENSSL_ROOT_DIR=/opt/openssl 2022-09-27T15:46:35.9476969Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_1a931f01-bb6b-4a63-90c0-f50a70553762 2022-09-27T15:46:35.9477377Z GITHUB_JOB=test 2022-09-27T15:46:35.9477624Z SCCACHE_S3_KEY_PREFIX=pull 2022-09-27T15:46:35.9478362Z COMMIT_MESSAGES=+ 871567eae42e57e9926f1a38c0b8d221f672c928 Fix mem leak because of self reference in CyclicLR+ acca92f8843f34854e73ee280c4f87ea280d2914 Typing for new CyclicLR+ 788218632109bd2065d4961cae624c62f106deee Rename scale_fn_* to private form+ 1642fe39e72ecf43de500a53610a076827e792b9 Test CyclicLR cyclic reference+ 52424e2bf38e454d535881fed9628d3e20f4f944 Fix linting 2022-09-27T15:46:35.9479092Z NVIDIA_DRIVER_CAPABILITIES=compute,utility 2022-09-27T15:46:35.9479392Z NUM_TEST_SHARDS=3 2022-09-27T15:46:35.9479629Z PR_NUMBER=85462 2022-09-27T15:46:35.9479871Z SHLVL=1 2022-09-27T15:46:35.9480243Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-11-6 2022-09-27T15:46:35.9480572Z GITHUB_REPOSITORY=pytorch/pytorch 2022-09-27T15:46:35.9481205Z NVIDIA_REQUIRE_CUDA=cuda>=11.6 brand=tesla,driver>=418,driver<419 brand=tesla,driver>=450,driver<451 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 2022-09-27T15:46:35.9481834Z NV_LIBNPP_DEV_VERSION=11.6.3.124-1 2022-09-27T15:46:35.9482151Z SHA1=52424e2bf38e454d535881fed9628d3e20f4f944 2022-09-27T15:46:35.9482444Z GITHUB_EVENT_NAME=pull_request 2022-09-27T15:46:35.9482777Z NV_CUDA_CUDART_VERSION=11.6.55-1 2022-09-27T15:46:35.9483154Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2022-09-27T15:46:35.9483441Z GITHUB_RUN_NUMBER=50832 2022-09-27T15:46:35.9483712Z GITHUB_WORKFLOW=pull 2022-09-27T15:46:35.9484152Z PATH=/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-09-27T15:46:35.9484634Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.12.10-1 2022-09-27T15:46:35.9485080Z 
GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-09-27T15:46:35.9485447Z GITHUB_TRIGGERING_ACTOR=albanD 2022-09-27T15:46:35.9485720Z _=/usr/bin/env 2022-09-27T15:46:35.9486006Z + echo 'Testing pytorch' 2022-09-27T15:46:35.9486274Z Testing pytorch 2022-09-27T15:46:35.9486564Z + export LANG=C.UTF-8 2022-09-27T15:46:35.9486826Z + LANG=C.UTF-8 2022-09-27T15:46:35.9487077Z + PR_NUMBER=85462 2022-09-27T15:46:35.9487354Z + [[ distributed == \d\e\f\a\u\l\t ]] 2022-09-27T15:46:35.9487717Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2022-09-27T15:46:35.9488156Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-09-27T15:46:35.9488488Z + [[ distributed == \s\l\o\w ]] 2022-09-27T15:46:35.9488903Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *slow-gradcheck* ]] 2022-09-27T15:46:35.9489365Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda* ]] 2022-09-27T15:46:35.9489732Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2022-09-27T15:46:35.9490046Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2022-09-27T15:46:35.9490477Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda11* ]] 2022-09-27T15:46:35.9490815Z + export BUILD_SPLIT_CUDA=ON 2022-09-27T15:46:35.9491077Z + BUILD_SPLIT_CUDA=ON 2022-09-27T15:46:35.9491362Z + [[ distributed == *crossref* ]] 2022-09-27T15:46:35.9491655Z + [[ distributed == *dynamo* ]] 2022-09-27T15:46:35.9492030Z + [[ -n 85462 ]] 2022-09-27T15:46:35.9492285Z + [[ -z '' ]] 2022-09-27T15:46:35.9492589Z + export PYTORCH_TEST_SKIP_CUDA_MEM_LEAK_CHECK=1 2022-09-27T15:46:35.9492947Z + PYTORCH_TEST_SKIP_CUDA_MEM_LEAK_CHECK=1 2022-09-27T15:46:35.9493354Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-09-27T15:46:35.9493810Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *-bazel-* ]] 2022-09-27T15:46:35.9494188Z + pip_install --user ninja 2022-09-27T15:46:35.9494558Z + pip install --progress-bar off --user ninja 2022-09-27T15:46:36.5173525Z Collecting ninja 2022-09-27T15:46:36.5373397Z Downloading ninja-1.10.2.3-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2022-09-27T15:46:37.4144604Z Installing collected packages: ninja 2022-09-27T15:46:37.4247414Z  WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH. 2022-09-27T15:46:37.4248303Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
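[Editor's note on the sanitization step in the `set -x` trace above: before `docker run -e` copies COMMIT_MESSAGES and PR_BODY into the container, the job flattens newlines/carriage returns and then strips single and double quotes using bash parameter expansion (`${VAR//[$'\n\r']}`, then `${VAR//[\'\"]}`), which is why the PR body appears as one long quote-free line in the environment dump. A minimal Python sketch of the equivalent transformation, for illustration only — the CI does this in bash, and the function name here is ours, not part of the CI scripts:

```python
def sanitize(value: str) -> str:
    # Drop newlines/carriage returns (bash: ${VAR//[$'\n\r']}).
    for ch in "\n\r":
        value = value.replace(ch, "")
    # Drop single and double quotes (bash: ${VAR//[\'\"]}).
    for ch in "'\"":
        value = value.replace(ch, "")
    return value

# A multi-line PR body collapses to a single quote-free line, matching
# the flattened PR_BODY seen in the environment dump above.
print(sanitize("Hi,\nwe noticed 'a problem'\r\n"))  # -> Hi,we noticed a problem
```
]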
2022-09-27T15:46:37.4306474Z Successfully installed ninja-1.10.2.3 2022-09-27T15:46:37.4988932Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-09-27T15:46:37.4990117Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-09-27T15:46:37.4991549Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *asan* ]] 2022-09-27T15:46:37.4992291Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2022-09-27T15:46:37.4992636Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2022-09-27T15:46:37.4997114Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *tbb* ]] 2022-09-27T15:46:37.5013141Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *libtorch* ]] 2022-09-27T15:46:37.5014088Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *-bazel-* ]] 2022-09-27T15:46:37.5016241Z + cd test 2022-09-27T15:46:37.5017076Z + python -c 'import torch; print(torch.__config__.show())' 2022-09-27T15:46:39.1695046Z PyTorch built with: 2022-09-27T15:46:39.1695772Z - GCC 7.5 2022-09-27T15:46:39.1696439Z - C++ Version: 201402 2022-09-27T15:46:39.1697203Z - Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-09-27T15:46:39.1697787Z - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815) 2022-09-27T15:46:39.1698198Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2022-09-27T15:46:39.1698572Z - LAPACK is enabled (usually provided by MKL) 2022-09-27T15:46:39.1698887Z - NNPACK is enabled 2022-09-27T15:46:39.1699219Z - CPU capability usage: AVX2 2022-09-27T15:46:39.1699529Z - CUDA Runtime 11.6 2022-09-27T15:46:39.1699905Z - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52 2022-09-27T15:46:39.1700300Z - CuDNN 8.3.2 (built against CUDA 11.5) 2022-09-27T15:46:39.1700601Z - Magma 2.6.1 2022-09-27T15:46:39.1703758Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Werror -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 2022-09-27T15:46:39.1705992Z 2022-09-27T15:46:39.3916951Z + cd test 2022-09-27T15:46:39.3917517Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2022-09-27T15:46:40.9446841Z ATen/Parallel: 2022-09-27T15:46:40.9447204Z at::get_num_threads() : 16 2022-09-27T15:46:40.9447478Z at::get_num_interop_threads() : 16 
2022-09-27T15:46:40.9447776Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2022-09-27T15:46:40.9448053Z omp_get_max_threads() : 16 2022-09-27T15:46:40.9448692Z Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-09-27T15:46:40.9449062Z mkl_get_max_threads() : 16 2022-09-27T15:46:40.9449508Z Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815) 2022-09-27T15:46:40.9449875Z std::thread::hardware_concurrency() : 32 2022-09-27T15:46:40.9450182Z Environment variables: 2022-09-27T15:46:40.9450449Z OMP_NUM_THREADS : [not set] 2022-09-27T15:46:40.9450715Z MKL_NUM_THREADS : [not set] 2022-09-27T15:46:40.9450982Z ATen parallel backend: OpenMP 2022-09-27T15:46:40.9451160Z 2022-09-27T15:46:41.1514915Z + [[ distributed == *deploy* ]] 2022-09-27T15:46:41.1515243Z + [[ distributed == *backward* ]] 2022-09-27T15:46:41.1515539Z + [[ distributed == *xla* ]] 2022-09-27T15:46:41.1515831Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2022-09-27T15:46:41.1516347Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *libtorch* ]] 2022-09-27T15:46:41.1516668Z + [[ distributed == distributed ]] 2022-09-27T15:46:41.1516940Z + install_torchdynamo 2022-09-27T15:46:41.1517183Z + local commit 2022-09-27T15:46:41.1519445Z ++ get_pinned_commit torchdynamo 2022-09-27T15:46:41.1519768Z ++ cat .github/ci_commit_pins/torchdynamo.txt 2022-09-27T15:46:41.1534191Z + commit=41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:41.1534767Z + pip_install --user git+https://github.com/pytorch/torchdynamo.git@41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:41.1535501Z + pip install --progress-bar off --user git+https://github.com/pytorch/torchdynamo.git@41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:41.6417105Z Collecting git+https://github.com/pytorch/torchdynamo.git@41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:41.6422644Z Cloning https://github.com/pytorch/torchdynamo.git (to revision 41c44bc1d080d6cf063419a4166732b983b84eef) to /tmp/pip-req-build-wekvzcja 2022-09-27T15:46:41.6442328Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/torchdynamo.git /tmp/pip-req-build-wekvzcja 2022-09-27T15:46:42.4895357Z Running command git rev-parse -q --verify 'sha^41c44bc1d080d6cf063419a4166732b983b84eef' 2022-09-27T15:46:42.4916366Z Running command git fetch -q https://github.com/pytorch/torchdynamo.git 41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:42.8203410Z Running command git checkout -q 41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:43.1282302Z Resolved https://github.com/pytorch/torchdynamo.git to commit 41c44bc1d080d6cf063419a4166732b983b84eef 2022-09-27T15:46:45.5584332Z Preparing metadata (setup.py) ... 
done 2022-09-27T15:46:45.5653156Z Requirement already satisfied: torch>=1.12.0 in /opt/conda/lib/python3.10/site-packages (from torchdynamo==1.13.0.dev0) (1.13.0a0+git52424e2) 2022-09-27T15:46:45.5657553Z Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (from torchdynamo==1.13.0.dev0) (1.21.2) 2022-09-27T15:46:45.6081757Z Collecting tabulate 2022-09-27T15:46:45.6278124Z Downloading tabulate-0.8.10-py3-none-any.whl (29 kB) 2022-09-27T15:46:45.6346034Z Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages/PyYAML-6.0-py3.10-linux-x86_64.egg (from torchdynamo==1.13.0.dev0) (6.0) 2022-09-27T15:46:45.6350310Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torchdynamo==1.13.0.dev0) (1.11.1) 2022-09-27T15:46:45.6376384Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch>=1.12.0->torchdynamo==1.13.0.dev0) (4.3.0) 2022-09-27T15:46:45.6409258Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torchdynamo==1.13.0.dev0) (1.2.1) 2022-09-27T15:46:45.6538947Z Building wheels for collected packages: torchdynamo 2022-09-27T15:46:50.0840496Z Building wheel for torchdynamo (setup.py) ... done 2022-09-27T15:46:50.0937463Z Created wheel for torchdynamo: filename=torchdynamo-1.13.0.dev0-cp310-cp310-linux_x86_64.whl size=2600957 sha256=ade44f5b279322939ee80fdbff39b9134208703a26f66cb50f48044b9fb55a55 2022-09-27T15:46:50.0939592Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/2e/47/4b/a72e6a8c4801cae81c62fd871ce3601d87ba0b7e2d5534e15c 2022-09-27T15:46:50.0964380Z Successfully built torchdynamo 2022-09-27T15:46:50.9633254Z Installing collected packages: tabulate, torchdynamo 2022-09-27T15:46:51.3273826Z Successfully installed tabulate-0.8.10 torchdynamo-1.13.0.dev0 2022-09-27T15:46:51.4126305Z + test_distributed 2022-09-27T15:46:51.4126982Z + echo 'Testing distributed python tests' 2022-09-27T15:46:51.4127296Z Testing distributed python tests 2022-09-27T15:46:51.4127742Z + python test/run_test.py --distributed-tests --shard 2 3 --verbose 2022-09-27T15:46:53.6293194Z Ignoring disabled issues: [] 2022-09-27T15:46:53.6653883Z /var/lib/jenkins/workspace/test/run_test.py:960: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
2022-09-27T15:46:53.6654435Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-09-27T15:46:53.6656081Z Found test time stats from artifacts 2022-09-27T15:46:53.6658071Z Selected tests: 2022-09-27T15:46:53.6658379Z distributed/rpc/cuda/test_tensorpipe_agent 2022-09-27T15:46:53.6658693Z distributed/fsdp/test_fsdp_core 2022-09-27T15:46:53.6659018Z distributed/fsdp/test_fsdp_state_dict 2022-09-27T15:46:53.6659332Z distributed/fsdp/test_fsdp_optim_state 2022-09-27T15:46:53.6659655Z distributed/_shard/sharded_tensor/test_sharded_tensor 2022-09-27T15:46:53.6659978Z distributed/test_c10d_pypg 2022-09-27T15:46:53.6660254Z distributed/fsdp/test_wrap 2022-09-27T15:46:53.6660519Z distributed/fsdp/test_fsdp_misc 2022-09-27T15:46:53.6660828Z distributed/fsdp/test_fsdp_grad_acc 2022-09-27T15:46:53.6661126Z distributed/test_c10d_spawn_nccl 2022-09-27T15:46:53.6661429Z distributed/fsdp/test_fsdp_freezing_weights 2022-09-27T15:46:53.6661737Z distributed/fsdp/test_fsdp_comm 2022-09-27T15:46:53.6662034Z distributed/fsdp/test_fsdp_exec_order 2022-09-27T15:46:53.6662341Z distributed/fsdp/test_fsdp_checkpoint 2022-09-27T15:46:53.6662624Z distributed/fsdp/test_fsdp_meta 2022-09-27T15:46:53.6662948Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-09-27T15:46:53.6663290Z distributed/fsdp/test_fsdp_ignored_modules 2022-09-27T15:46:53.6663633Z distributed/_shard/checkpoint/test_file_system_checkpoint 2022-09-27T15:46:53.6663969Z distributed/fsdp/test_fsdp_memory 2022-09-27T15:46:53.6664294Z distributed/_shard/sharding_plan/test_sharding_plan 2022-09-27T15:46:53.6664808Z distributed/_shard/test_partial_tensor 2022-09-27T15:46:53.6665140Z distributed/fsdp/test_fsdp_apply 2022-09-27T15:46:53.6665466Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-09-27T15:46:53.6665768Z distributed/fsdp/test_fsdp_input 2022-09-27T15:46:53.6666084Z distributed/_shard/sharded_tensor/ops/test_linear 2022-09-27T15:46:53.6666427Z distributed/_shard/sharded_tensor/ops/test_init 2022-09-27T15:46:53.6666770Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-09-27T15:46:53.6667146Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-09-27T15:46:53.6667493Z distributed/fsdp/test_fsdp_multiple_forward 2022-09-27T15:46:53.6667813Z distributed/fsdp/test_fsdp_pure_fp16 2022-09-27T15:46:53.6668109Z distributed/elastic/timer/local_timer_test 2022-09-27T15:46:53.6668547Z distributed/fsdp/test_fsdp_traversal 2022-09-27T15:46:53.6668861Z distributed/elastic/utils/distributed_test 2022-09-27T15:46:53.6669202Z distributed/_shard/sharded_optim/test_sharded_optim 2022-09-27T15:46:53.6669579Z distributed/fsdp/test_flatten_params_wrapper 2022-09-27T15:46:53.6669902Z distributed/fsdp/test_checkpoint_wrapper 2022-09-27T15:46:53.6670195Z distributed/elastic/utils/logging_test 2022-09-27T15:46:53.6670487Z distributed/test_launcher 2022-09-27T15:46:53.6671029Z distributed/_shard/checkpoint/test_utils 2022-09-27T15:46:53.6671325Z distributed/test_nccl 2022-09-27T15:46:53.6671617Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-09-27T15:46:53.6671973Z distributed/elastic/events/lib_test 2022-09-27T15:46:53.6672369Z distributed/pipeline/sync/skip/test_api 2022-09-27T15:46:53.6672669Z distributed/pipeline/sync/skip/test_leak 2022-09-27T15:46:53.6672993Z distributed/pipeline/sync/skip/test_tracker 2022-09-27T15:46:53.6673311Z distributed/pipeline/sync/test_bugs 2022-09-27T15:46:53.6673623Z distributed/pipeline/sync/test_deferred_batch_norm 2022-09-27T15:46:53.6673959Z 
distributed/pipeline/sync/test_microbatch 2022-09-27T15:46:53.6674282Z distributed/pipeline/sync/test_pipeline 2022-09-27T15:46:53.6674578Z distributed/pipeline/sync/test_worker 2022-09-27T15:46:53.6799358Z Prioritized test from test file changes. 2022-09-27T15:46:53.6799662Z reordering tests for PR: 2022-09-27T15:46:53.6800229Z prioritized: ['distributed/fsdp/test_fsdp_optim_state', 'distributed/fsdp/test_checkpoint_wrapper'] 2022-09-27T15:46:53.6804766Z the rest: ['distributed/rpc/cuda/test_tensorpipe_agent', 'distributed/fsdp/test_fsdp_core', 'distributed/fsdp/test_fsdp_state_dict', 'distributed/_shard/sharded_tensor/test_sharded_tensor', 'distributed/test_c10d_pypg', 'distributed/fsdp/test_wrap', 'distributed/fsdp/test_fsdp_misc', 'distributed/fsdp/test_fsdp_grad_acc', 'distributed/test_c10d_spawn_nccl', 'distributed/fsdp/test_fsdp_freezing_weights', 'distributed/fsdp/test_fsdp_comm', 'distributed/fsdp/test_fsdp_exec_order', 'distributed/fsdp/test_fsdp_checkpoint', 'distributed/fsdp/test_fsdp_meta', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops', 'distributed/fsdp/test_fsdp_ignored_modules', 'distributed/_shard/checkpoint/test_file_system_checkpoint', 'distributed/fsdp/test_fsdp_memory', 'distributed/_shard/sharding_plan/test_sharding_plan', 'distributed/_shard/test_partial_tensor', 'distributed/fsdp/test_fsdp_apply', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp', 'distributed/fsdp/test_fsdp_input', 'distributed/_shard/sharded_tensor/ops/test_linear', 'distributed/_shard/sharded_tensor/ops/test_init', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag', 'distributed/fsdp/test_fsdp_multiple_forward', 'distributed/fsdp/test_fsdp_pure_fp16', 'distributed/elastic/timer/local_timer_test', 'distributed/fsdp/test_fsdp_traversal', 'distributed/elastic/utils/distributed_test', 'distributed/_shard/sharded_optim/test_sharded_optim', 'distributed/fsdp/test_flatten_params_wrapper', 'distributed/elastic/utils/logging_test', 'distributed/test_launcher', 'distributed/_shard/checkpoint/test_utils', 'distributed/test_nccl', 'distributed/_shard/sharded_tensor/ops/test_math_ops', 'distributed/elastic/events/lib_test', 'distributed/pipeline/sync/skip/test_api', 'distributed/pipeline/sync/skip/test_leak', 'distributed/pipeline/sync/skip/test_tracker', 'distributed/pipeline/sync/test_bugs', 'distributed/pipeline/sync/test_deferred_batch_norm', 'distributed/pipeline/sync/test_microbatch', 'distributed/pipeline/sync/test_pipeline', 'distributed/pipeline/sync/test_worker'] 2022-09-27T15:46:53.6807618Z 2022-09-27T15:46:53.6808153Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-09-27T15:46:53.7017167Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-09-27T15:46:53.7220757Z Running distributed/fsdp/test_fsdp_optim_state ... [2022-09-27 15:46:53.721712] 2022-09-27T15:46:53.7221474Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_optim_state.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 15:46:53.721774] 2022-09-27T15:46:55.5857582Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state 2022-09-27T15:46:55.5887182Z 2022-09-27T15:46:55.5887501Z Running tests... 
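The DeprecationWarning that run_test.py emitted above points at a distutils LooseVersion comparison of the CUDA version. Below is a minimal sketch of the substitution the warning itself suggests, using packaging.version; it assumes the packaging package is importable in the test environment:

    # Sketch only: the packaging.version equivalent of the deprecated
    # LooseVersion comparison quoted from test/run_test.py above.
    import torch
    from packaging import version  # assumed to be installed in the environment

    if torch.version.cuda is not None and version.parse(torch.version.cuda) >= version.parse("11.6"):
        print("taking the CUDA >= 11.6 code path")

version.parse returns a Version object with well-defined ordering, which is the drop-in replacement distutils recommends here.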
2022-09-27T15:46:55.5887945Z ---------------------------------------------------------------------- 2022-09-27T15:46:55.5893763Z test_flatten_sharded_optim_state_dict_nested (__main__.TestFSDPOptimState) 2022-09-27T15:46:57.1034469Z Tests :meth:`flatten_sharded_optim_state_dict` for an FSDP-root ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:46:57.1220901Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 652 2022-09-27T15:46:57.1226870Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 653 2022-09-27T15:46:58.7461271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:46:58.7461798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:46:58.7462396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:46:58.7462868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:46:58.7681808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:46:58.7682272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:46:58.7685257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:46:58.7685748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:46:59.0088201Z dist init r=1, world=2 2022-09-27T15:46:59.0092716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:46:59.0124118Z dist init r=0, world=2 2022-09-27T15:46:59.0128965Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:46:59.0129765Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:46:59.0195272Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:00.3902716Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:00.3903360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:00.9948011Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:00.9948573Z warnings.warn( 2022-09-27T15:47:00.9949326Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:00.9949836Z warnings.warn( 2022-09-27T15:47:01.5325707Z ok (5.943s) 2022-09-27T15:47:01.5333048Z test_flatten_sharded_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-09-27T15:47:01.5344021Z Tests :meth:`flatten_sharded_optim_state_dict` for an FSDP-root ... 
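The UserWarning from fully_sharded_data_parallel.py:3652, repeated for nearly every test below, says the optim_input argument is deprecated and may be removed without changing behavior. A minimal sketch of the change it asks for, assuming the torch-1.13-era static method FSDP.full_optim_state_dict(model, optim, optim_input=None):

    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def save_full_optim_state(model, optim):
        # `model` is an FSDP-wrapped module and `optim` its optimizer
        # (placeholders for illustration).
        # Deprecated form, which triggers the UserWarning seen in this log:
        #   FSDP.full_optim_state_dict(model, optim, optim_input)
        # Equivalent form without the deprecated argument:
        return FSDP.full_optim_state_dict(model, optim)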
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 737 2022-09-27T15:47:01.5350731Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 738 2022-09-27T15:47:03.2066018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:03.2066497Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:03.2067747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:03.2068221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:03.2107760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:03.2108198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:03.2110929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:03.2111667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:03.4570756Z dist init r=0, world=2 2022-09-27T15:47:03.4574513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:03.4637735Z dist init r=1, world=2 2022-09-27T15:47:03.4642454Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:03.4643576Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:03.4677575Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:04.8482401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:04.8482911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:06.7453496Z ok (5.213s) 2022-09-27T15:47:06.7462439Z test_full_optim_state_dict_keys (__main__.TestFSDPOptimState) 2022-09-27T15:47:06.7477085Z Tests that the parameter keys returned by ... 
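Each test's startup logs a pair of store-based barrier messages: every rank adds a key to the c10d store, then waits until all world_size ranks have checked in. A rough single-function sketch of that handshake follows; the real logic lives in torch.distributed.distributed_c10d and differs in details such as timeout handling:

    import time

    def store_based_barrier(store, rank, world_size,
                            key="store_based_barrier_key:1", timeout=300.0):
        # torch.distributed stores expose add(key, amount) -> new counter value.
        arrived = store.add(key, 1)          # this rank checks in
        deadline = time.monotonic() + timeout
        while arrived != world_size:         # poll until everyone has arrived
            if time.monotonic() > deadline:
                raise RuntimeError(
                    f"rank {rank}: barrier timed out ({arrived}/{world_size})")
            time.sleep(0.01)
            arrived = store.add(key, 0)      # add(0) reads without modifying

This matches the sequencing in the log: "Added key: store_based_barrier_key:1" on the add(key, 1), then "Completed store-based barrier ... with 2 nodes" once the counter reaches world_size.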
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 822 2022-09-27T15:47:06.7484034Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 823 2022-09-27T15:47:08.4263589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:08.4264097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:08.4266491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:08.4266974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:08.4334826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:08.4335285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:08.4338111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:08.4338609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:08.6828922Z dist init r=1, world=2 2022-09-27T15:47:08.6832597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:08.6832994Z dist init r=0, world=2 2022-09-27T15:47:08.6838890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:08.6839708Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:08.6935555Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:10.0527570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:10.0528097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:11.0570829Z ok (4.312s) 2022-09-27T15:47:11.0578478Z test_full_optim_state_dict_nested_invalid (__main__.TestFSDPOptimState) 2022-09-27T15:47:11.0592971Z Tests that :meth:`full_optim_state_dict` raises an error when ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 907 2022-09-27T15:47:11.0599876Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 908 2022-09-27T15:47:12.7507302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:12.7507808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:12.7508611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:12.7509076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:12.7595184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:12.7595675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:12.7598213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:12.7598666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:13.0108046Z dist init r=1, world=2 2022-09-27T15:47:13.0112656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:13.0118655Z dist init r=0, world=2 2022-09-27T15:47:13.0123487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:13.0124287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:13.0215274Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:14.3851639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:14.3852293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:15.3686880Z ok (4.311s) 2022-09-27T15:47:15.3700172Z test_optim_input_warning (__main__.TestFSDPOptimState) 2022-09-27T15:47:15.3714484Z Tests that passing the ``optim_input`` argument into optimizer state ... 
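The "loaded 45 slow tests" and "loaded 261 disabled tests" UserWarnings correspond to the two JSON stats files the runner downloaded before this shard started (slow-tests.json and disabled-tests-condensed.json). A sketch of what --import-slow-tests implies, reusing the URL from this log; the actual loading code in torch.testing._internal.common_utils differs:

    import json
    import urllib.request
    import warnings

    SLOW_TESTS_URL = ("https://raw.githubusercontent.com/pytorch/test-infra/"
                      "generated-stats/stats/slow-tests.json")

    def load_slow_tests():
        # Fetch the generated stats file and report how many entries loaded,
        # mirroring the UserWarning text seen throughout this log.
        with urllib.request.urlopen(SLOW_TESTS_URL) as f:
            slow_tests_dict = json.load(f)
        warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
        return slow_tests_dict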
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 992 2022-09-27T15:47:15.3721450Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 993 2022-09-27T15:47:17.0615630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:17.0616146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:17.0617246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:17.0617723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:17.0768545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:17.0769036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:17.0771580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:17.0772043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:17.3238355Z dist init r=1, world=2 2022-09-27T15:47:17.3242293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:17.3252787Z dist init r=0, world=2 2022-09-27T15:47:17.3258568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:17.3259646Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:17.3345400Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:18.7025324Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:18.7025851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:19.6806507Z ok (4.312s) 2022-09-27T15:47:19.6813136Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:47:19.6826229Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
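Test names like test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False come from parameterizing one test body over a small matrix of arguments. A sketch using the internal @parametrize helper from torch.testing._internal.common_utils; treat the exact name-mangling as an assumption:

    from torch.testing._internal.common_utils import (
        TestCase,
        instantiate_parametrized_tests,
        parametrize,
        run_tests,
    )

    class TestNamingSketch(TestCase):
        @parametrize("rank0_only", [False, True])
        @parametrize("use_multiple_param_groups", [False, True])
        def test_optim_state_dict_nested(self, use_multiple_param_groups, rank0_only):
            # Expanded by the runner into names such as
            # test_optim_state_dict_nested_use_multiple_param_groups_False_rank0_only_False
            self.assertIn(rank0_only, (False, True))

    instantiate_parametrized_tests(TestNamingSketch)

    if __name__ == "__main__":
        run_tests()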
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1077 2022-09-27T15:47:19.6832159Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1078 2022-09-27T15:47:21.3716849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:21.3717331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:21.3718584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:21.3719065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:21.3764261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:21.3764694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:21.3767661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:21.3768144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:21.6183689Z dist init r=0, world=2 2022-09-27T15:47:21.6187572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:21.6315596Z dist init r=1, world=2 2022-09-27T15:47:21.6320454Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:21.6321583Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:21.6391866Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:23.0032037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:23.0032575Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:23.5662376Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:23.5662966Z warnings.warn( 2022-09-27T15:47:23.5663965Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:23.5664470Z warnings.warn( 2022-09-27T15:47:24.0917321Z ok (4.411s) 2022-09-27T15:47:24.0924119Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:47:24.0937010Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1162 2022-09-27T15:47:24.0943366Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1163 2022-09-27T15:47:25.7346045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:25.7346577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:25.7347365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:25.7347838Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:25.7944800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:25.7945278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:25.7946413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:25.7946880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:25.9935682Z dist init r=1, world=2 2022-09-27T15:47:25.9939350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:26.0386444Z dist init r=0, world=2 2022-09-27T15:47:26.0392194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:26.0393028Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:26.0447981Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:27.3985782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:27.3986294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:27.9646912Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:27.9647493Z warnings.warn( 2022-09-27T15:47:27.9648210Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:27.9648725Z warnings.warn( 2022-09-27T15:47:28.5031896Z ok (4.411s) 2022-09-27T15:47:28.5040157Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:47:28.5054148Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1247 2022-09-27T15:47:28.5060483Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1248 2022-09-27T15:47:30.1909012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:30.1909783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:30.1910398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:30.1911226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:30.2060925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:30.2061383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:30.2064252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:30.2064733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:30.4486923Z dist init r=1, world=2 2022-09-27T15:47:30.4490710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:30.4522885Z dist init r=0, world=2 2022-09-27T15:47:30.4528445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:30.4529793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:30.4594467Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:31.8194905Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:31.8195875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:32.3882517Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:32.3883597Z warnings.warn( 2022-09-27T15:47:32.3884996Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:32.3885979Z warnings.warn( 2022-09-27T15:47:32.9148387Z ok (4.412s) 2022-09-27T15:47:32.9155689Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:47:32.9169445Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1332 2022-09-27T15:47:32.9175908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1333 2022-09-27T15:47:34.6094340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:34.6094839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:34.6095647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:34.6096123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:34.6497211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:34.6497656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:34.6500114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:34.6500602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:34.8663578Z dist init r=0, world=2 2022-09-27T15:47:34.8667923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:34.8969784Z dist init r=1, world=2 2022-09-27T15:47:34.8975393Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:34.8976184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:34.9075369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:36.2797684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:36.2798216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:36.8759019Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:36.8759865Z warnings.warn( 2022-09-27T15:47:36.8760611Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:36.8761103Z warnings.warn( 2022-09-27T15:47:37.4263629Z ok (4.511s) 2022-09-27T15:47:37.4270264Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:47:37.4284410Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1417 2022-09-27T15:47:37.4290714Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1418 2022-09-27T15:47:39.1068708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:39.1069247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:39.1069836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:39.1070311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:39.1427913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:39.1428385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:39.1431232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:39.1431923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:39.3631852Z dist init r=1, world=2 2022-09-27T15:47:39.3635619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:39.3872618Z dist init r=0, world=2 2022-09-27T15:47:39.3878444Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:39.3879246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:39.3940948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:40.7698433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:40.7698956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:41.3594565Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:41.3595158Z warnings.warn( 2022-09-27T15:47:41.3596154Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:41.3596691Z warnings.warn( 2022-09-27T15:47:41.8377625Z ok (4.411s) 2022-09-27T15:47:41.8384624Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:47:41.8398371Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1502 2022-09-27T15:47:41.8404817Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1503 2022-09-27T15:47:43.4776257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:43.4777031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:43.4780768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:43.4781286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:43.4803661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:43.4804130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:43.4807024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:43.4807510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:43.7144262Z dist init r=0, world=2 2022-09-27T15:47:43.7147609Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:43.7322317Z dist init r=1, world=2 2022-09-27T15:47:43.7327047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:43.7328311Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:43.7352003Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:45.0733534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:45.0734073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:45.6561084Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:45.6561702Z warnings.warn( 2022-09-27T15:47:45.6563248Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:45.6563779Z warnings.warn( 2022-09-27T15:47:46.1489713Z ok (4.311s) 2022-09-27T15:47:46.1496326Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:47:46.1509586Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1587 2022-09-27T15:47:46.1516669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1588 2022-09-27T15:47:47.7944354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:47.7945148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:47.7945764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:47.7946694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:47.7948323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:47.7948899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:47.7952177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:47.7952650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:48.0316272Z dist init r=0, world=2 2022-09-27T15:47:48.0319890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:48.0476579Z dist init r=1, world=2 2022-09-27T15:47:48.0481253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:48.0482282Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:48.0525298Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:49.3937516Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:49.3938068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:49.9700942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:49.9701545Z warnings.warn( 2022-09-27T15:47:49.9702532Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:49.9703138Z warnings.warn( 2022-09-27T15:47:50.4611946Z ok (4.312s) 2022-09-27T15:47:50.4618945Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:47:50.4631960Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1672 2022-09-27T15:47:50.4638370Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1673 2022-09-27T15:47:52.1316923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:52.1317919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:52.1319113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:52.1320057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:52.1333230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:52.1334165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:52.1335786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:52.1336754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:52.3681308Z dist init r=0, world=2 2022-09-27T15:47:52.3684963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:52.3937214Z dist init r=1, world=2 2022-09-27T15:47:52.3942765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:52.3944182Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:52.3992777Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:53.7705298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:53.7706279Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:54.3482746Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:54.3483890Z warnings.warn( 2022-09-27T15:47:54.3491084Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:54.3492486Z warnings.warn( 2022-09-27T15:47:54.8736356Z ok (4.412s) 2022-09-27T15:47:54.8742973Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:47:54.8756250Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1757 2022-09-27T15:47:54.8762686Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1758 2022-09-27T15:47:56.5196664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:56.5197174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:56.5198464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:56.5198967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:56.5253011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:47:56.5253469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:47:56.5256266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:47:56.5256742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:47:56.7624935Z dist init r=1, world=2 2022-09-27T15:47:56.7628526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:47:56.7763256Z dist init r=0, world=2 2022-09-27T15:47:56.7769403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:47:56.7770804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:56.7833847Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:47:58.1263078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:47:58.1264183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:47:58.7025431Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:58.7026488Z warnings.warn( 2022-09-27T15:47:58.7031543Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:47:58.7032851Z warnings.warn( 2022-09-27T15:47:59.1847304Z ok (4.311s) 2022-09-27T15:47:59.1854302Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:47:59.1877466Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1842 2022-09-27T15:47:59.1883470Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1843 2022-09-27T15:48:00.8149400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:00.8149912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:00.8151043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:00.8151785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:00.8454674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:00.8455134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:00.8457804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:00.8458274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:01.0759809Z dist init r=0, world=2 2022-09-27T15:48:01.0763798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:01.0913001Z dist init r=1, world=2 2022-09-27T15:48:01.0918346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:01.0919130Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:01.0968740Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:02.5024512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:02.5025245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:03.0982238Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:03.0982815Z warnings.warn( 2022-09-27T15:48:03.0983531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:03.0984062Z warnings.warn( 2022-09-27T15:48:03.5972357Z ok (4.412s) 2022-09-27T15:48:03.5979153Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:48:03.5992375Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1927 2022-09-27T15:48:03.5999217Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1928 2022-09-27T15:48:05.3125952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:05.3126545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:05.3127148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:05.3127620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:05.3225729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:05.3226192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:05.3228783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:05.3229333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:05.5718923Z dist init r=0, world=2 2022-09-27T15:48:05.5722880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:05.5772876Z dist init r=1, world=2 2022-09-27T15:48:05.5778141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:05.5778989Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:05.5825762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:06.9511704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:06.9512308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:07.4077668Z ok (3.810s) 2022-09-27T15:48:07.4083079Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:48:07.4096802Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2008 2022-09-27T15:48:07.4137286Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2009 2022-09-27T15:48:09.0695910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:09.0696413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:09.0698590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:09.0699079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:09.0720178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:09.0720631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:09.0723158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:09.0723641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:09.3168173Z dist init r=1, world=2 2022-09-27T15:48:09.3171816Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:09.3318521Z dist init r=0, world=2 2022-09-27T15:48:09.3323932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:09.3324868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:09.3376834Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:10.7111245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:10.7111984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:11.2180281Z ok (3.810s) 2022-09-27T15:48:11.2187655Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:48:11.2202163Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2089 2022-09-27T15:48:11.2208605Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2090 2022-09-27T15:48:12.8842654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:12.8843170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:12.8843927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:12.8844431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:12.9265602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:12.9266244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:12.9268857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:12.9269352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:13.1391550Z dist init r=1, world=2 2022-09-27T15:48:13.1395752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:13.1699035Z dist init r=0, world=2 2022-09-27T15:48:13.1704276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:13.1705075Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:13.1802671Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:14.5450219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:14.5450890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:15.1804912Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:15.1805477Z warnings.warn( 2022-09-27T15:48:15.1806208Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:15.1806728Z warnings.warn( 2022-09-27T15:48:15.7298740Z ok (4.512s) 2022-09-27T15:48:15.7305843Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:48:15.7320148Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2174 2022-09-27T15:48:15.7326356Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2175 2022-09-27T15:48:17.4126880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:17.4127400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:17.4129949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:17.4130440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:17.4559172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:17.4559638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:17.4562649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:17.4563267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:17.6654413Z dist init r=0, world=2 2022-09-27T15:48:17.6658491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:17.6997230Z dist init r=1, world=2 2022-09-27T15:48:17.7002679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:17.7003488Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:17.7065091Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:19.0684123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:19.0685033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:19.6698993Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:19.6699552Z warnings.warn( 2022-09-27T15:48:19.6700678Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:19.6701184Z warnings.warn( 2022-09-27T15:48:20.2413490Z ok (4.511s) 2022-09-27T15:48:20.2420856Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:48:20.2435267Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2259 2022-09-27T15:48:20.2441621Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2260 2022-09-27T15:48:21.8982232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:21.8983148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:21.8984206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:21.8984688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:21.9161440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:21.9161901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:21.9164873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:21.9165365Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:22.1699562Z dist init r=1, world=2 2022-09-27T15:48:22.1703301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:22.1738511Z dist init r=0, world=2 2022-09-27T15:48:22.1744009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:22.1745109Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:22.1806370Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:23.5485299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:23.5485855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:23.9518840Z ok (3.710s) 2022-09-27T15:48:23.9526558Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:48:23.9540039Z Tests :meth:`full_optim_state_dict` and `sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2340 2022-09-27T15:48:23.9546763Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2341 2022-09-27T15:48:25.5730352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:25.5730868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:25.5731903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:25.5732943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:25.6083832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:25.6084515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:25.6087513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:25.6088036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:25.8289537Z dist init r=1, world=2 2022-09-27T15:48:25.8293081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:25.8538934Z dist init r=0, world=2 2022-09-27T15:48:25.8543928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:25.8544764Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:25.8599969Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:27.2288735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:27.2289266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:27.6620811Z ok (3.710s) 2022-09-27T15:48:27.6625761Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-09-27T15:48:27.6640240Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
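[annotation] The `rekey_optim_state_dict` cases starting here convert an optimizer state dict between its two key schemes: parameter names (used by FSDP's consolidated dicts) and parameter IDs (used natively by `torch.optim`). A hedged sketch, assuming `model` and a `full_osd` from a prior `full_optim_state_dict` call as in the sketch above:

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.fully_sharded_data_parallel import OptimStateKeyType

# Name-keyed -> ID-keyed, e.g. to load into a plain non-FSDP model's optimizer
# (the direction the *_to_ids tests here cover).
osd_by_id = FSDP.rekey_optim_state_dict(full_osd, OptimStateKeyType.PARAM_ID, model)

# ID-keyed -> name-keyed, the direction test_rekey_optim_state_dict_to_names covers.
osd_by_name = FSDP.rekey_optim_state_dict(osd_by_id, OptimStateKeyType.PARAM_NAME, model)
```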
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2421 2022-09-27T15:48:27.6647096Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2422 2022-09-27T15:48:29.3104469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:29.3104956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:29.3105986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:29.3106458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:29.3569065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:29.3569513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:29.3572255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:29.3572724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:29.5639509Z dist init r=1, world=2 2022-09-27T15:48:29.5643423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:29.5947396Z dist init r=0, world=2 2022-09-27T15:48:29.5952733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:29.5954074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:29.6050468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:30.9551013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:30.9551524Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:31.5671155Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:31.5671954Z warnings.warn( 2022-09-27T15:48:31.5674092Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:31.5674597Z warnings.warn( 2022-09-27T15:48:32.0741529Z ok (4.412s) 2022-09-27T15:48:32.0747524Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-09-27T15:48:32.0762484Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2506 2022-09-27T15:48:32.0768973Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2507 2022-09-27T15:48:33.6998850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:33.6999379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:33.7000333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:33.7000854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:33.7257685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:33.7260742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:33.7261305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:33.7261776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:33.9570375Z dist init r=0, world=2 2022-09-27T15:48:33.9574197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:33.9708491Z dist init r=1, world=2 2022-09-27T15:48:33.9713686Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:33.9714475Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:33.9778492Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:35.3504967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:35.3505541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:35.9650028Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:35.9650616Z warnings.warn( 2022-09-27T15:48:35.9653981Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:35.9654542Z warnings.warn( 2022-09-27T15:48:36.4861821Z ok (4.412s) 2022-09-27T15:48:36.4868089Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-09-27T15:48:36.4881948Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2591 2022-09-27T15:48:36.4888340Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2592 2022-09-27T15:48:38.1570320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:38.1571214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:38.1572097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:38.1572587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:38.2069261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:38.2069741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:38.2072655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:38.2073141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:38.4123100Z dist init r=1, world=2 2022-09-27T15:48:38.4127032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:38.4514125Z dist init r=0, world=2 2022-09-27T15:48:38.4519244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:38.4520056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:38.4532992Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:39.8584311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:39.8584839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:40.4454881Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:40.4455450Z warnings.warn( 2022-09-27T15:48:40.4456172Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:40.4456714Z warnings.warn( 2022-09-27T15:48:40.9977292Z ok (4.511s) 2022-09-27T15:48:40.9982905Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-09-27T15:48:40.9996415Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
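[annotation] The `StateDictType_FULL_STATE_DICT` / `StateDictType_SHARDED_STATE_DICT` parts of these test names refer to FSDP's state-dict mode, normally selected with the `FSDP.state_dict_type` context manager. A minimal sketch (assumes `model` is an FSDP-wrapped module):

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

# Inside this context, model.state_dict() returns sharded entries (each rank
# keeps only its shard) rather than a fully materialized FULL_STATE_DICT.
with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
    local_sharded_sd = model.state_dict()
```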
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2676 2022-09-27T15:48:41.0003062Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2677 2022-09-27T15:48:42.6848221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:42.6848733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:42.6850009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:42.6850514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:42.7108304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:42.7108786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:42.7111732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:42.7112225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:42.9487460Z dist init r=1, world=2 2022-09-27T15:48:42.9491770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:42.9602246Z dist init r=0, world=2 2022-09-27T15:48:42.9607409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:42.9608456Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:42.9697398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:44.3533653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:44.3534205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:44.9335160Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:44.9335693Z warnings.warn( 2022-09-27T15:48:44.9339255Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:44.9339802Z warnings.warn( 2022-09-27T15:48:45.5091464Z ok (4.511s) 2022-09-27T15:48:45.5095213Z test_rekey_optim_state_dict_to_names (__main__.TestFSDPOptimState) 2022-09-27T15:48:45.5108931Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2761 2022-09-27T15:48:45.5115688Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2762 2022-09-27T15:48:47.1976108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:47.1976612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:47.1978647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:47.1979132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:47.2291756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:47.2292218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:47.2294719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:47.2295195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:47.4556978Z dist init r=1, world=2 2022-09-27T15:48:47.4561011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:47.4749532Z dist init r=0, world=2 2022-09-27T15:48:47.4754979Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:47.4755735Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:47.4766757Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:48.8573180Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:48.8573902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:49.5082215Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:49.5082792Z warnings.warn( 2022-09-27T15:48:49.5087014Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:49.5087518Z warnings.warn( 2022-09-27T15:48:50.0202166Z ok (4.511s) 2022-09-27T15:48:50.0207155Z test_scatter_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-09-27T15:48:50.0220920Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
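[annotation] `scatter_full_optim_state_dict`, tested from here on, is the flip side of consolidation: rank 0 holds a full optimizer state dict (e.g. loaded from a checkpoint) and each rank receives only its own shard. The `halve_world_size` variant checks that this still works when the checkpoint was saved under a different world size. A sketch, assuming `model`/`optim` as above and a hypothetical checkpoint file:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Only rank 0 needs the full dict; other ranks may pass None.
full_osd = torch.load("osd.pt") if dist.get_rank() == 0 else None  # hypothetical file
sharded_osd = FSDP.scatter_full_optim_state_dict(full_osd, model)
optim.load_state_dict(sharded_osd)
```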
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2846 2022-09-27T15:48:50.0226649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2847 2022-09-27T15:48:51.6843155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:51.6843660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:51.6844448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:51.6844913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:51.7197960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:51.7198439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:51.7201276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:51.7201738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:51.9432628Z dist init r=0, world=2 2022-09-27T15:48:51.9437021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:51.9672503Z dist init r=1, world=2 2022-09-27T15:48:51.9678110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:51.9678906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:51.9742349Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:53.3408508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:53.3409176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:53.8479889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-09-27T15:48:53.8483527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-09-27T15:48:53.8484566Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:48:53.8580955Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:48:54.0231727Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:54.0232312Z warnings.warn( 2022-09-27T15:48:54.0233268Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 
2022-09-27T15:48:54.0233818Z warnings.warn( 2022-09-27T15:48:54.0269109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-09-27T15:48:54.0275089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-09-27T15:48:54.0275842Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:48:54.0371516Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:48:54.6314390Z ok (4.611s) 2022-09-27T15:48:54.6321743Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:48:54.6336173Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2943 2022-09-27T15:48:54.6342089Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2944 2022-09-27T15:48:56.2841962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:56.2842831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:56.2843449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:56.2843932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:56.3166252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:48:56.3166713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:48:56.3169799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:48:56.3170289Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:48:56.5464735Z dist init r=0, world=2 2022-09-27T15:48:56.5468714Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:48:56.5562133Z dist init r=1, world=2 2022-09-27T15:48:56.5567522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:48:56.5568297Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:56.5571418Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:48:57.9543270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:48:57.9543843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:48:58.6570436Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:48:58.6570989Z warnings.warn( 2022-09-27T15:48:58.6577392Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 
2022-09-27T15:48:58.6578187Z warnings.warn( 2022-09-27T15:48:59.3432091Z ok (4.712s) 2022-09-27T15:48:59.3439545Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:48:59.3454055Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3028 2022-09-27T15:48:59.3459829Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3029 2022-09-27T15:49:00.9663108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:00.9663624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:00.9664529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:00.9665006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:00.9979957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:00.9980425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:00.9983142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:00.9983623Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:01.2236059Z dist init r=0, world=2 2022-09-27T15:49:01.2242398Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:01.2924831Z dist init r=1, world=2 2022-09-27T15:49:01.2930367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:01.2931725Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:01.2953667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:02.6488191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:02.6488764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:03.3309090Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:03.3309659Z warnings.warn( 2022-09-27T15:49:03.3313546Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:03.3314062Z warnings.warn( 2022-09-27T15:49:03.9548398Z ok (4.612s) 2022-09-27T15:49:03.9555081Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:49:03.9568576Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3113 2022-09-27T15:49:03.9574611Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3114 2022-09-27T15:49:05.5975364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:05.5976137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:05.5977054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:05.5977567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:05.6901164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:05.6902067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:05.6903213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:05.6903721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:05.8515643Z dist init r=0, world=2 2022-09-27T15:49:05.8519010Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:05.9244080Z dist init r=1, world=2 2022-09-27T15:49:05.9249952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:05.9251352Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:05.9331235Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:07.2869509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:07.2871219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:07.9465530Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:07.9466072Z warnings.warn( 2022-09-27T15:49:07.9466803Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:07.9467326Z warnings.warn( 2022-09-27T15:49:08.5664963Z ok (4.612s) 2022-09-27T15:49:08.5672200Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:49:08.5685987Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3198 2022-09-27T15:49:08.5691816Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3199 2022-09-27T15:49:10.1945197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:10.1945861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:10.1947436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:10.1947963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:10.2579756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:10.2580259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:10.2582276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:10.2582761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:10.4473579Z dist init r=1, world=2 2022-09-27T15:49:10.4477764Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:10.5026707Z dist init r=0, world=2 2022-09-27T15:49:10.5032388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:10.5033659Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:10.5086830Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:11.8889183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:11.8889706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:12.5390131Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:12.5390935Z warnings.warn( 2022-09-27T15:49:12.5391694Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:12.5392208Z warnings.warn( 2022-09-27T15:49:13.0780517Z ok (4.511s) 2022-09-27T15:49:13.0786809Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:49:13.0800601Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3283 2022-09-27T15:49:13.0806466Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3284 2022-09-27T15:49:14.7489932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:14.7490809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:14.7491447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:14.7491931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:14.7712711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:14.7713212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:14.7716598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:14.7717083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:15.0106317Z dist init r=0, world=2 2022-09-27T15:49:15.0110037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:15.0211166Z dist init r=1, world=2 2022-09-27T15:49:15.0216822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:15.0218032Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:15.0315056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:16.4006103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:16.4006686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:17.0934449Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:17.0935046Z warnings.warn( 2022-09-27T15:49:17.0936694Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:17.0937216Z warnings.warn( 2022-09-27T15:49:17.6895650Z ok (4.611s) 2022-09-27T15:49:17.6901766Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:49:17.6906181Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85092 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-09-27T15:49:17.6911931Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:49:17.6925232Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
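[annotation] The skip just above comes from the disabled-test mechanism whose loading is logged at the top of every subprocess ("loaded 45 slow tests" / "loaded 261 disabled tests"): the harness fetches dicts of slow and disabled tests (keyed to tracking issues, like #85092 here) and skips matching tests when running under CI. The following is a simplified illustration of that pattern, not PyTorch's actual implementation; the names are hypothetical:

```python
import os
import unittest

# Hypothetical stand-in for the disabled_tests_dict the harness loads.
DISABLED = {"test_example": "https://github.com/pytorch/pytorch/issues/85092"}

def skip_if_disabled(test_name: str):
    """Return a skipIf decorator that fires for disabled tests under CI."""
    issue = DISABLED.get(test_name)
    return unittest.skipIf(
        issue is not None and os.environ.get("CI") is not None,
        f"Test is disabled because an issue exists disabling it: {issue}",
    )
```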
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3368 2022-09-27T15:49:17.6931226Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3369 2022-09-27T15:49:19.3302452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:19.3302977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:19.3304009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:19.3304504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:19.3461389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:19.3461868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:19.3464784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:19.3465254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:19.5948178Z dist init r=1, world=2 2022-09-27T15:49:19.5951698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:19.5970766Z dist init r=0, world=2 2022-09-27T15:49:19.5975908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:19.5976888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:19.6054528Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:20.9863261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:20.9863958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:21.6398827Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:21.6399417Z warnings.warn( 2022-09-27T15:49:21.6404458Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:21.6404960Z warnings.warn( 2022-09-27T15:49:22.2018510Z ok (4.511s) 2022-09-27T15:49:22.2024575Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:49:22.2039093Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
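[annotation] The `use_multiple_param_groups_True` variants build the optimizer from several parameter groups with differing hyperparameters, the harder case for state-dict consolidation and rekeying since group membership must survive the round trip. A minimal example of such an optimizer:

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Linear(8, 2))
optim = torch.optim.Adam(
    [
        {"params": model[0].parameters(), "lr": 1e-3},
        {"params": model[1].parameters(), "lr": 1e-4, "weight_decay": 1e-2},
    ]
)
```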
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3453 2022-09-27T15:49:22.2044741Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3454 2022-09-27T15:49:23.8778170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:23.8778696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:23.8779270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:23.8779743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:23.8792444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:23.8792919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:23.8796478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:23.8796994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:24.1243368Z dist init r=0, world=2 2022-09-27T15:49:24.1247058Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:24.1353747Z dist init r=1, world=2 2022-09-27T15:49:24.1358977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:24.1360144Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:24.1451401Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:25.5187280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:25.5187862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:26.1911985Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:26.1912544Z warnings.warn( 2022-09-27T15:49:26.1918312Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:26.1918845Z warnings.warn( 2022-09-27T15:49:26.8133761Z ok (4.611s) 2022-09-27T15:49:26.8137947Z test_scatter_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-09-27T15:49:26.8151481Z Tests :meth:`scatter_full_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3538 2022-09-27T15:49:26.8157799Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3539 2022-09-27T15:49:28.4646600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:28.4647097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:28.4648158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:28.4648641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:28.4754706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:28.4755144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:28.4757602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:28.4758094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:28.7243067Z dist init r=1, world=2 2022-09-27T15:49:28.7244175Z dist init r=0, world=2 2022-09-27T15:49:28.7247330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:28.7249348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:28.7250101Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:28.7350413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:30.0942455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:30.0943018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:30.8009643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-09-27T15:49:30.8012585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-09-27T15:49:30.8013460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:49:30.8110767Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:49:31.0610098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-09-27T15:49:31.0613558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-09-27T15:49:31.0614578Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:49:31.0711399Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:49:31.1207566Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T15:49:31.1208856Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:49:31.1210100Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:49:31.1211346Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:49:31.7254782Z ok (4.912s) 2022-09-27T15:49:31.7259401Z test_shard_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-09-27T15:49:31.7273115Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3635 2022-09-27T15:49:31.7279939Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3636 2022-09-27T15:49:33.3960328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:33.3960851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:33.3961455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:33.3961921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:33.3974777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:33.3975224Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:33.3978039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:33.3980427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:33.6473636Z dist init r=1, world=2 2022-09-27T15:49:33.6477346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:33.6535451Z dist init r=0, world=2 2022-09-27T15:49:33.6540444Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:33.6541271Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:33.6580132Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
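[annotation] The `python_variable.cpp` warnings above are emitted by the C++/Python bindings when a tensor is deallocated while Python-side weak references to it are still live; they surface during the transformer test's teardown rather than from the test logic itself. The Python-level pattern the warning text refers to is simply holding a `weakref` to a tensor past its strong references; the snippet below only illustrates that pattern and does not itself trigger the warning:

```python
import weakref
import torch

t = torch.ones(3)
wr = weakref.ref(t)   # weak reference to the tensor's PyObject
del t                 # drop the last strong reference
print(wr())           # None once the tensor has been deallocated
```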
2022-09-27T15:49:35.0283179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:35.0283698Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:35.5254237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-09-27T15:49:35.5257562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-09-27T15:49:35.5258355Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:49:35.5355045Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:49:35.6900729Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:35.6901286Z warnings.warn( 2022-09-27T15:49:35.6903412Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:35.6903947Z warnings.warn( 2022-09-27T15:49:35.6940405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-09-27T15:49:35.6955011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-09-27T15:49:35.6955679Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:49:35.7043470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:49:36.3370470Z ok (4.611s) 2022-09-27T15:49:36.3376757Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:49:36.3390505Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
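[annotation] `shard_full_optim_state_dict`, covered from here on, is the non-collective counterpart to `scatter_full_optim_state_dict`: every rank already holds (or loads) the full dict and locally slices out its own shard, with no communication. A sketch, assuming `model`/`optim` as above and that every rank loaded the same hypothetical checkpoint:

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

full_osd = torch.load("osd.pt")  # hypothetical checkpoint, loaded on every rank
sharded_osd = FSDP.shard_full_optim_state_dict(full_osd, model)
optim.load_state_dict(sharded_osd)
```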
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3732 2022-09-27T15:49:36.3396509Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3733 2022-09-27T15:49:38.0348433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:38.0348957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:38.0349887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:38.0350378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:38.0568187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:38.0568660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:38.0571722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:38.0572207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:38.2928637Z dist init r=0, world=2 2022-09-27T15:49:38.2932832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:38.3070219Z dist init r=1, world=2 2022-09-27T15:49:38.3075360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:38.3076138Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:38.3137534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:39.6773167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:39.6773814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:40.3820321Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:40.3821157Z warnings.warn( 2022-09-27T15:49:40.3824500Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:40.3825007Z warnings.warn( 2022-09-27T15:49:41.0488705Z ok (4.712s) 2022-09-27T15:49:41.0495260Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:49:41.0509505Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3817 2022-09-27T15:49:41.0516446Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3818 2022-09-27T15:49:42.7070227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:42.7071002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:42.7072100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:42.7072630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:42.7303547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:42.7304020Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:42.7306622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:42.7307115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:42.9664933Z dist init r=1, world=2 2022-09-27T15:49:42.9668458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:42.9759088Z dist init r=0, world=2 2022-09-27T15:49:42.9764604Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:42.9765377Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:42.9770811Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:44.3629898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:44.3630590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:45.0401132Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:45.0401711Z warnings.warn( 2022-09-27T15:49:45.0403188Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:45.0403765Z warnings.warn( 2022-09-27T15:49:45.6621865Z ok (4.613s) 2022-09-27T15:49:45.6628159Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:49:45.6642356Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3902 2022-09-27T15:49:45.6648244Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3903 2022-09-27T15:49:47.3515747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:47.3516505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:47.3517762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:47.3518242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:47.3582070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:47.3582515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:47.3585463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:47.3585943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:47.6007225Z dist init r=1, world=2 2022-09-27T15:49:47.6011256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:47.6120413Z dist init r=0, world=2 2022-09-27T15:49:47.6125319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:47.6126353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:47.6215512Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:48.9707025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:48.9707539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:49.5968899Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:49.5969463Z warnings.warn( 2022-09-27T15:49:49.5971510Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:49.5972008Z warnings.warn( 2022-09-27T15:49:50.1736119Z ok (4.511s) 2022-09-27T15:49:50.1742414Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:49:50.1756527Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3987 2022-09-27T15:49:50.1762647Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3988 2022-09-27T15:49:51.8331574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:51.8332081Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:51.8332895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:51.8333595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:51.8996242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:51.8996770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:51.8997784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:51.8998260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:52.1000707Z dist init r=1, world=2 2022-09-27T15:49:52.1005027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:52.1327996Z dist init r=0, world=2 2022-09-27T15:49:52.1333356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:52.1334152Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:52.1411772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:53.5088551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:53.5089083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:54.1600379Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:54.1600954Z warnings.warn( 2022-09-27T15:49:54.1605033Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:54.1605585Z warnings.warn( 2022-09-27T15:49:54.7852122Z ok (4.611s) 2022-09-27T15:49:54.7858263Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:49:54.7871745Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4072 2022-09-27T15:49:54.7877954Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4073 2022-09-27T15:49:56.4727622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:56.4728129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:56.4729423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:56.4729965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:56.4891329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:49:56.4891806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:49:56.4894842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:49:56.4895328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:49:56.7349808Z dist init r=0, world=2 2022-09-27T15:49:56.7353731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:49:56.7377249Z dist init r=1, world=2 2022-09-27T15:49:56.7382433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:49:56.7383239Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:56.7457124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:49:58.1018687Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:49:58.1019207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:49:58.8038780Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:58.8039512Z warnings.warn( 2022-09-27T15:49:58.8041751Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:49:58.8042553Z warnings.warn( 2022-09-27T15:49:59.3966993Z ok (4.611s) 2022-09-27T15:49:59.3972261Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:49:59.3985651Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
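The "store_based_barrier_key" INFO lines that open every test case above are c10d's rendezvous bookkeeping: during init_process_group each rank increments a counter key in the shared store and spins until the count reaches world_size. A minimal sketch that produces the same INFO lines, assuming a single-node env:// rendezvous rather than the tests' own multiprocess harness:

    import os
    import torch.distributed as dist

    # Environment-variable rendezvous (assumed values for a local run).
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))

    # In this era of PyTorch, init_process_group ends with a store-based
    # barrier: each rank adds "store_based_barrier_key:1" to the TCPStore
    # and waits for world_size ranks -- exactly the INFO lines in the log.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    print(f"dist init r={dist.get_rank()}, world={dist.get_world_size()}")
    dist.destroy_process_group()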
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4157 2022-09-27T15:49:59.3991629Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4158 2022-09-27T15:50:01.0589846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:01.0590493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:01.0591686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:01.0592186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:01.0844304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:01.0844750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:01.0847430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:01.0847911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:01.3206466Z dist init r=1, world=2 2022-09-27T15:50:01.3210531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:01.3315456Z dist init r=0, world=2 2022-09-27T15:50:01.3320741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:01.3321817Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:01.3414791Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:02.7213598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:02.7214139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:03.4028911Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:03.4029478Z warnings.warn( 2022-09-27T15:50:03.4030195Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:03.4031086Z warnings.warn( 2022-09-27T15:50:04.0080337Z ok (4.611s) 2022-09-27T15:50:04.0087022Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-09-27T15:50:04.0100290Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4242 2022-09-27T15:50:04.0106240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4243 2022-09-27T15:50:05.6729336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:05.6729845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:05.6730452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:05.6730921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:05.6967746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:05.6968204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:05.6970965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:05.6971454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:05.9362815Z dist init r=1, world=2 2022-09-27T15:50:05.9367927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:05.9474565Z dist init r=0, world=2 2022-09-27T15:50:05.9479677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:05.9480465Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:05.9571867Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:07.3138519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:07.3139052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:07.9423197Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:07.9423777Z warnings.warn( 2022-09-27T15:50:07.9428959Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:07.9429477Z warnings.warn( 2022-09-27T15:50:08.5196555Z ok (4.512s) 2022-09-27T15:50:08.5203278Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-09-27T15:50:08.5217414Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4327 2022-09-27T15:50:08.5223556Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4328 2022-09-27T15:50:10.1973229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:10.1973746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:10.1974569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:10.1975055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:10.2158635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:10.2159119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:10.2162660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:10.2163178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:10.4627138Z dist init r=1, world=2 2022-09-27T15:50:10.4631076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:10.4696787Z dist init r=0, world=2 2022-09-27T15:50:10.4702333Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:10.4703268Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:10.4733890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:11.8379138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:11.8379826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:12.4969094Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:12.4969677Z warnings.warn( 2022-09-27T15:50:12.4974504Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:12.4975029Z warnings.warn( 2022-09-27T15:50:13.0312587Z ok (4.511s) 2022-09-27T15:50:13.0317238Z test_shard_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-09-27T15:50:13.0330821Z Tests :meth:`shard_full_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4412 2022-09-27T15:50:13.0336890Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4413 2022-09-27T15:50:14.6649010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:14.6649531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:14.6650327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:14.6650798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:14.6902309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:14.6902772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:14.6906358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:14.6906840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:14.9253892Z dist init r=1, world=2 2022-09-27T15:50:14.9257697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:14.9354809Z dist init r=0, world=2 2022-09-27T15:50:14.9360056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:14.9360845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:14.9361544Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:16.3343256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:16.3343812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:17.0462423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-09-27T15:50:17.0465572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-09-27T15:50:17.0466372Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:50:17.0563042Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T15:50:17.3379667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-09-27T15:50:17.3384055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-09-27T15:50:17.3384826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:50:17.3410028Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:17.3411297Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:17.3412707Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:17.3413932Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:17.3480813Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T15:50:18.0434690Z ok (5.012s) 2022-09-27T15:50:18.0441112Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_FULL_STATE_DICT_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-09-27T15:50:18.0453456Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4509 2022-09-27T15:50:18.0459454Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4510 2022-09-27T15:50:19.7483397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:19.7483892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:19.7485258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:19.7485733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:19.7660027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:19.7660475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:19.7663383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:19.7663861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:20.0193483Z dist init r=0, world=2 2022-09-27T15:50:20.0193764Z dist init r=1, world=2 2022-09-27T15:50:20.0198285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:20.0198820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:20.0199594Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:20.0200266Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:21.4146365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:21.4147075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:21.9001585Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. 
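The python_variable.cpp decref warnings above concern the Python/C++ tensor binding: once a weakref slot exists for a tensor's PyObject, a tensor revived through that reference needs its bookkeeping repaired before the PyObject goes away, which is what the warning's `_fix_weakref()` advice refers to. A hedged illustration of that advice; `Tensor._fix_weakref()` is an internal API of this era's PyTorch, and this is not the test's own code:

    import weakref
    import torch

    t = torch.arange(3.0)
    wr = weakref.ref(t)   # creates the weakref slot the warning mentions

    t2 = wr()             # dereference the weak reference
    if t2 is not None:
        # Per the warning's guidance, repair the weakref bookkeeping
        # after dereferencing (internal, no-argument method).
        t2._fix_weakref()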
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:21.9003182Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:21.9240721Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:21.9241267Z warnings.warn( 2022-09-27T15:50:21.9351308Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:21.9352070Z warnings.warn( 2022-09-27T15:50:22.4544822Z ok (4.411s) 2022-09-27T15:50:22.4550475Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_FULL_STATE_DICT_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-09-27T15:50:22.4564706Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4594 2022-09-27T15:50:22.4571106Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4595 2022-09-27T15:50:24.0827757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:24.0828253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:24.0829143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:24.0829644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:24.0886244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:24.0886700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:24.0889908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:24.0890389Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:24.3269904Z dist init r=0, world=2 2022-09-27T15:50:24.3273614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:24.3432100Z dist init r=1, world=2 2022-09-27T15:50:24.3437262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:24.3438029Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:24.3479043Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-09-27T15:50:25.7053923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:25.7054642Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:26.1807120Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:26.1808443Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T15:50:26.2036134Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:26.2036688Z warnings.warn( 2022-09-27T15:50:26.2176043Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:26.2176568Z warnings.warn( 2022-09-27T15:50:26.6653676Z ok (4.211s) 2022-09-27T15:50:26.6658964Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_SHARDED_STATE_DICT_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-09-27T15:50:26.6672216Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4679 2022-09-27T15:50:26.6678513Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4680 2022-09-27T15:50:28.3443336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:28.3443849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:28.3444441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:28.3444898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:28.3694248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:28.3694752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:28.3698120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:28.3698597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:28.6101681Z dist init r=0, world=2 2022-09-27T15:50:28.6106299Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:28.6156716Z dist init r=1, world=2 2022-09-27T15:50:28.6161938Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:28.6162732Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-09-27T15:50:28.6209152Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:30.0053563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:30.0054127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:30.5450534Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:30.5451117Z warnings.warn( 2022-09-27T15:50:30.5452190Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:30.5452675Z warnings.warn( 2022-09-27T15:50:31.0762992Z ok (4.411s) 2022-09-27T15:50:31.0768336Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_SHARDED_STATE_DICT_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-09-27T15:50:31.0781813Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4764 2022-09-27T15:50:31.0787926Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4765 2022-09-27T15:50:32.7105438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:32.7105972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:32.7106897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:32.7107372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:32.7375789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:50:32.7376256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:50:32.7379888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:50:32.7380376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:50:32.9695184Z dist init r=1, world=2 2022-09-27T15:50:32.9699232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T15:50:32.9839949Z dist init r=0, world=2 2022-09-27T15:50:32.9845135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T15:50:32.9846042Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T15:50:32.9903641Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-09-27T15:50:34.3575437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:50:34.3576083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:50:34.8912602Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:34.8913552Z warnings.warn( 2022-09-27T15:50:34.8915087Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:3652: UserWarning: The `optim_input` argument is deprecated. You may remove it from your code without changing its functionality. 2022-09-27T15:50:34.8915613Z warnings.warn( 2022-09-27T15:50:35.3872236Z ok (4.311s) 2022-09-27T15:50:35.3872583Z 2022-09-27T15:50:35.3873365Z ---------------------------------------------------------------------- 2022-09-27T15:50:35.3873810Z Ran 50 tests in 219.798s 2022-09-27T15:50:35.3873982Z 2022-09-27T15:50:35.3874099Z OK (skipped=1) 2022-09-27T15:50:35.3874258Z 2022-09-27T15:50:35.3874367Z Generating XML reports... 2022-09-27T15:50:35.3973246Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20220927154655.xml 2022-09-27T15:50:35.7492494Z Running distributed/fsdp/test_checkpoint_wrapper ... [2022-09-27 15:50:35.748720] 2022-09-27T15:50:35.7493298Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 15:50:35.748796] 2022-09-27T15:50:37.6015403Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper 2022-09-27T15:50:37.6031862Z 2022-09-27T15:50:37.6032313Z Running tests... 2022-09-27T15:50:37.6032811Z ---------------------------------------------------------------------- 2022-09-27T15:50:37.6057155Z test_apply_activation_checkpointing_wrapper (__main__.CheckpointWrapperTest) 2022-09-27T15:50:39.2420889Z Ensures that `apply_activation_checkpointing_wrapper` can be used ... ok (1.638s) 2022-09-27T15:50:39.6540593Z test_checkpoint_wrapper_cpu_offload (__main__.CheckpointWrapperTest) ... ok (0.412s) 2022-09-27T15:50:39.6630959Z test_checkpoint_wrapper_kwarg_support (__main__.CheckpointWrapperTest) ... ok (0.009s) 2022-09-27T15:50:39.6651197Z test_checkpoint_wrapper_parity (__main__.CheckpointWrapperTest) 2022-09-27T15:50:39.6654709Z Tests that using checkpoint_wrapper or the functional ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79510 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-09-27T15:50:39.6668487Z test_forward_missing_attributes (__main__.CheckpointWrapperTest) ... ok (0.001s) 2022-09-27T15:50:39.6680271Z test_fqn (__main__.CheckpointWrapperTest) ... ok (0.001s) 2022-09-27T15:50:39.6711146Z test_load_activation_checkpointed_module (__main__.CheckpointWrapperTest) ... 
ok (0.003s) 2022-09-27T15:50:39.6711680Z 2022-09-27T15:50:39.6712287Z ---------------------------------------------------------------------- 2022-09-27T15:50:39.6713103Z Ran 7 tests in 2.068s 2022-09-27T15:50:39.6713466Z 2022-09-27T15:50:39.6713693Z OK (skipped=1) 2022-09-27T15:50:39.6714007Z 2022-09-27T15:50:39.6714212Z Generating XML reports... 2022-09-27T15:50:39.6769570Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper/TEST-CheckpointWrapperTest-20220927155037.xml 2022-09-27T15:50:40.0716697Z Running distributed/rpc/cuda/test_tensorpipe_agent ... [2022-09-27 15:50:40.071127] 2022-09-27T15:50:40.0717672Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/rpc/cuda/test_tensorpipe_agent.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 15:50:40.071205] 2022-09-27T15:50:41.9263489Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6xb1puqg 2022-09-27T15:50:41.9264692Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6xb1puqg/_remote_module_non_scriptable.py 2022-09-27T15:50:42.3762128Z ]> 2022-09-27T15:50:42.3763035Z test_ddp_dist_autograd_local_vs_remote_gpu (__main__.TensorPipeCudaDdpComparisonTest) 2022-09-27T15:50:42.3763837Z , <__main__.TensorPipeCudaDistAutogradTest testMethod=test_gpu_to_cpu_continuation>, <__main__.TensorPipeCudaDistAutogradTest testMethod=test_gpu_to_cpu_continuation_gpu_root>]> 2022-09-27T15:50:42.3764650Z test_gpu_simple (__main__.TensorPipeCudaDistAutogradTest) 2022-09-27T15:50:42.3765063Z test_gpu_to_cpu_continuation (__main__.TensorPipeCudaDistAutogradTest) 2022-09-27T15:50:42.3765524Z test_gpu_to_cpu_continuation_gpu_root (__main__.TensorPipeCudaDistAutogradTest) 2022-09-27T15:50:42.3766784Z , <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_input_moved_to_cuda_device_script>, <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_invalid_devices>, <__main__.TensorPipeCudaRemoteModuleTest testMethod=test_valid_device>]> 2022-09-27T15:50:42.3768294Z test_input_moved_to_cuda_device (__main__.TensorPipeCudaRemoteModuleTest) 2022-09-27T15:50:42.3769337Z test_input_moved_to_cuda_device_script (__main__.TensorPipeCudaRemoteModuleTest) 2022-09-27T15:50:42.3770160Z test_invalid_devices (__main__.TensorPipeCudaRemoteModuleTest) 2022-09-27T15:50:42.3770771Z test_valid_device (__main__.TensorPipeCudaRemoteModuleTest) 2022-09-27T15:50:42.3771254Z ]> 2022-09-27T15:50:42.3771698Z test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest) 2022-09-27T15:50:42.3772990Z , <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_except_last>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_never>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_gloo_ckpt_never_find_unused>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_always>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_except_last>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_never>, <__main__.TensorPipePipeWithDDPTest testMethod=test_basic_nccl_ckpt_never_find_unused>]> 2022-09-27T15:50:42.3774692Z test_basic_gloo_ckpt_always (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3775264Z test_basic_gloo_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3776141Z test_basic_gloo_ckpt_never (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3776909Z test_basic_gloo_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) 
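For context on the CheckpointWrapperTest suite that just passed above, here is a minimal sketch of the API it exercises, assuming the 2022-era private module path torch.distributed.algorithms._checkpoint.checkpoint_wrapper (later releases rename the helper to apply_activation_checkpointing); the toy model is illustrative only:

    import torch.nn as nn
    from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
        apply_activation_checkpointing_wrapper,
        checkpoint_wrapper,
    )

    model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))

    # Wrap selected submodules in-place with activation checkpointing;
    # check_fn decides which submodules get wrapped.
    apply_activation_checkpointing_wrapper(
        model,
        checkpoint_wrapper_fn=checkpoint_wrapper,
        check_fn=lambda m: isinstance(m, nn.Linear),
    )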
2022-09-27T15:50:42.3777887Z test_basic_nccl_ckpt_always (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3778487Z test_basic_nccl_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3778916Z test_basic_nccl_ckpt_never (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3779390Z test_basic_nccl_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) 2022-09-27T15:50:42.3794592Z , <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_async_execution_with_cuda_future>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_callback_changes_devices>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_custom_class_with_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_list_with_cuda_sparse_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_can_extract_list_with_cuda_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_int>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_as_str>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_device_not_cuda>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_modify_tensor_inplace>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_replace_tensor>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_cuda_future_value_on_bad_device>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_multi>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_nested>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_custom_stream_nested_multi>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu_to_gpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_cpu_to_gpu_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_default_to_non_default>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_6>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_7>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_device_map_gpu_mixed_8>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest 
testMethod=test_owner_rref_forward_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_owner_rref_forward_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_as_arg_synchronization5>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_forward_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization1>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization2>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization3>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_to_here_synchronization4>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_rref_with_unpickleable_attributes>, <__main__.TensorPipeTensorPipeAgentCudaRpcTest testMethod=test_tensor_view_as_return_value>]> 2022-09-27T15:50:42.3808444Z test_async_execution_nested_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3808970Z test_async_execution_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3809484Z test_cuda_future_callback_changes_devices (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3810007Z test_cuda_future_can_extract_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3810515Z test_cuda_future_can_extract_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3811056Z test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3811617Z test_cuda_future_can_extract_custom_class_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3812169Z test_cuda_future_can_extract_list_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3812697Z test_cuda_future_can_extract_list_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3813205Z test_cuda_future_device_as_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3813686Z test_cuda_future_device_as_int (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3814147Z test_cuda_future_device_as_str (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3814624Z test_cuda_future_device_not_cuda (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3815122Z test_cuda_future_modify_tensor_inplace (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3815667Z test_cuda_future_replace_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3816149Z test_cuda_future_value_on_bad_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) 2022-09-27T15:50:42.3816619Z test_custom_stream 
(__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3817072Z test_custom_stream_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3817519Z test_custom_stream_nested (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3817991Z test_custom_stream_nested_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3818454Z test_device_map_cpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3818928Z test_device_map_cpu_to_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3819467Z test_device_map_cpu_to_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3819959Z test_device_map_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3820458Z test_device_map_gpu_default_to_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3820931Z test_device_map_gpu_mixed_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3821397Z test_device_map_gpu_mixed_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3821857Z test_device_map_gpu_mixed_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3822320Z test_device_map_gpu_mixed_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3822758Z test_device_map_gpu_mixed_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3823224Z test_device_map_gpu_mixed_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3823680Z test_device_map_gpu_mixed_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3824129Z test_device_map_gpu_mixed_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3824605Z test_device_map_gpu_mixed_self_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3825082Z test_device_map_gpu_mixed_self_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3825541Z test_device_map_gpu_mixed_self_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3826012Z test_device_map_gpu_mixed_self_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3826478Z test_device_map_gpu_mixed_self_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3826945Z test_device_map_gpu_mixed_self_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3827396Z test_device_map_gpu_mixed_self_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3827861Z test_device_map_gpu_mixed_self_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3828343Z test_device_map_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3828826Z test_device_map_gpu_non_default_to_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3829330Z test_device_map_gpu_to_cpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3829829Z test_device_map_gpu_to_cpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3830303Z test_device_maps_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3831074Z test_device_maps_in_options (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3831582Z test_device_maps_invalid_max_local_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3832091Z test_device_maps_invalid_max_remote_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3832581Z test_device_maps_invalid_min_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3833068Z test_device_maps_many_to_one (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3833546Z test_device_maps_missing_config (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3834108Z test_device_maps_missing_config_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3834612Z test_device_maps_missing_config_not_timeout (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3835118Z test_device_maps_missing_config_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3835633Z test_device_maps_missing_config_remote_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3836150Z test_device_maps_missing_config_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3836650Z test_device_maps_missing_config_response_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3837141Z test_device_maps_multi_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3837683Z test_device_maps_multi_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3838143Z test_device_maps_one_to_many (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3838606Z test_device_maps_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3839078Z test_device_maps_return_to_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3839566Z test_device_maps_return_to_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3840128Z test_device_maps_wrong_worker_name (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3840597Z test_device_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3841066Z test_devices_option_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3841536Z test_devices_option_mismatch_reverse (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3842042Z test_owner_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3842564Z test_owner_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3843080Z test_owner_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3843572Z test_owner_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3844072Z test_rref_as_arg_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3844561Z test_rref_as_arg_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3845031Z test_rref_as_arg_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3845545Z test_rref_as_arg_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3846024Z test_rref_as_arg_synchronization5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3846517Z test_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3847007Z test_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3847504Z test_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3848000Z test_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3848474Z test_rref_to_here_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3848966Z test_rref_to_here_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3849458Z test_rref_to_here_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3849947Z test_rref_to_here_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3850433Z test_rref_with_unpickleable_attributes (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3850930Z test_tensor_view_as_return_value (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
2022-09-27T15:50:42.3851879Z , <__main__.TensorPipeTensorPipeCudaDistAutogradTest testMethod=test_dist_autograd_sync_streams>, <__main__.TensorPipeTensorPipeCudaDistAutogradTest testMethod=test_gradients_synchronizations>]>
2022-09-27T15:50:42.3852782Z test_device_maps_backward_pass (__main__.TensorPipeTensorPipeCudaDistAutogradTest)
2022-09-27T15:50:42.3853278Z test_dist_autograd_sync_streams (__main__.TensorPipeTensorPipeCudaDistAutogradTest)
2022-09-27T15:50:42.3853795Z test_gradients_synchronizations (__main__.TensorPipeTensorPipeCudaDistAutogradTest)
2022-09-27T15:50:43.9702811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:43.9703318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:43.9705683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:43.9706470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:44.2005643Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu8c0eutk
2022-09-27T15:50:44.2006739Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu8c0eutk/_remote_module_non_scriptable.py
2022-09-27T15:50:44.6237363Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:50:44.6252496Z 
2022-09-27T15:50:44.6252913Z Running tests...
2022-09-27T15:50:44.6253398Z ----------------------------------------------------------------------
2022-09-27T15:50:46.0761514Z test_ddp_dist_autograd_local_vs_remote_gpu (__main__.TensorPipeCudaDdpComparisonTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:50:46.0940036Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4968
2022-09-27T15:50:46.0946012Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4969
2022-09-27T15:50:46.0952498Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 4970
2022-09-27T15:50:46.0959646Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 4971
2022-09-27T15:50:47.7266628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:47.7267132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:47.7268564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:47.7269029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:47.7363798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:47.7364258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:47.7367661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:47.7368133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:47.7402253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:47.7402712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:47.7406343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:47.7406801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:47.7725599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:47.7726057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:47.7729643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:47.7730361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:47.9660892Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjizbknpn
2022-09-27T15:50:47.9662206Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjizbknpn/_remote_module_non_scriptable.py
2022-09-27T15:50:47.9717493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfb37tglu
2022-09-27T15:50:47.9720271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfb37tglu/_remote_module_non_scriptable.py
2022-09-27T15:50:47.9757873Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8hrjtllw
2022-09-27T15:50:47.9760578Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8hrjtllw/_remote_module_non_scriptable.py
2022-09-27T15:50:48.0010555Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2hjo9_v2
2022-09-27T15:50:48.0013612Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2hjo9_v2/_remote_module_non_scriptable.py
2022-09-27T15:50:48.4185332Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:50:48.4191653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:50:48.4194174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:50:48.4480348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:50:48.9037520Z skip: Need at least 4 CUDA devices (4.278s)
2022-09-27T15:50:48.9037778Z 
2022-09-27T15:50:48.9038291Z ----------------------------------------------------------------------
2022-09-27T15:50:48.9038765Z Ran 1 test in 4.278s
2022-09-27T15:50:48.9038932Z 
2022-09-27T15:50:48.9039044Z OK (skipped=1)
2022-09-27T15:50:48.9039231Z 
2022-09-27T15:50:48.9039361Z Generating XML reports...
2022-09-27T15:50:48.9073992Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDdpComparisonTest-20220927155044.xml
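The "skip: Need at least 4 CUDA devices" result above is a device-count guard, not a failure: the DDP-comparison test wants four GPUs and this shard's runner exposes fewer. A minimal sketch of such a guard, assuming a hypothetical require_n_gpus helper (the real check lives in torch.testing._internal.common_distributed):

    import unittest
    import torch

    def require_n_gpus(n):
        # Hypothetical stand-in for the suite's skip helper: skip unless
        # at least n CUDA devices are visible to this process.
        return unittest.skipUnless(
            torch.cuda.is_available() and torch.cuda.device_count() >= n,
            f"Need at least {n} CUDA devices",
        )

    class DdpComparisonTest(unittest.TestCase):
        @require_n_gpus(4)
        def test_ddp_dist_autograd_local_vs_remote_gpu(self):
            ...  # body elided; exercises DDP plus dist autograd on 4 GPUs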
2022-09-27T15:50:50.8715004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:50.8715529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:50.8718073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:50.8718559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:51.1077334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppkqrimbk
2022-09-27T15:50:51.1079522Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppkqrimbk/_remote_module_non_scriptable.py
2022-09-27T15:50:51.5458316Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:50:51.5474587Z 
2022-09-27T15:50:51.5474773Z Running tests...
2022-09-27T15:50:51.5475235Z ----------------------------------------------------------------------
2022-09-27T15:50:53.0333535Z test_gpu_simple (__main__.TensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:50:53.0517226Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5139
2022-09-27T15:50:53.0524133Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5140
2022-09-27T15:50:53.0530998Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5141
2022-09-27T15:50:53.0538179Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5142
2022-09-27T15:50:54.6389438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:54.6390418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:54.6391963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:54.6393308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:54.6523392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:54.6523873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:54.6527125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:54.6527600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:54.6610643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:54.6611103Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:54.6614335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:54.6614805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:54.6663415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:54.6663873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:54.6668258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:54.6668716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:50:54.8872069Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoq9gxu3o
2022-09-27T15:50:54.8872958Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoq9gxu3o/_remote_module_non_scriptable.py
2022-09-27T15:50:54.8909981Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx99eep74
2022-09-27T15:50:54.8912784Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx99eep74/_remote_module_non_scriptable.py
2022-09-27T15:50:54.8945699Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4fp_znqa
2022-09-27T15:50:54.8948628Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4fp_znqa/_remote_module_non_scriptable.py
2022-09-27T15:50:54.9017108Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkysbz058
2022-09-27T15:50:54.9020345Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkysbz058/_remote_module_non_scriptable.py
2022-09-27T15:50:55.3356527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:50:55.3378718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:50:55.3405161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:50:55.3560404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:50:55.4574609Z fi_getinfo: -61
2022-09-27T15:50:55.4592073Z fi_getinfo: -61
2022-09-27T15:50:55.4617952Z fi_getinfo: -61
2022-09-27T15:50:55.4776228Z fi_getinfo: -61
2022-09-27T15:50:57.9654440Z ok (6.418s)
2022-09-27T15:50:57.9654672Z 
2022-09-27T15:50:57.9655067Z ----------------------------------------------------------------------
2022-09-27T15:50:57.9655387Z Ran 1 test in 6.418s
2022-09-27T15:50:57.9655552Z 
2022-09-27T15:50:57.9655650Z OK
2022-09-27T15:50:57.9655786Z 
2022-09-27T15:50:57.9655921Z Generating XML reports...
2022-09-27T15:50:57.9692264Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220927155051.xml
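test_gpu_simple, which just passed, drives torch.distributed.autograd over the TensorPipe RPC agent. A minimal single-worker sketch of the dist-autograd flow it builds on (the localhost rendezvous values are illustrative; the real test spawns four processes and uses CUDA tensors routed through device maps):

    import os
    import torch
    import torch.distributed.autograd as dist_autograd
    import torch.distributed.rpc as rpc

    os.environ.setdefault("MASTER_ADDR", "localhost")  # illustrative rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    rpc.init_rpc("worker0", rank=0, world_size=1)

    with dist_autograd.context() as context_id:
        # Gradients accumulate per context instead of in .grad, which is
        # the behavior these RPC tests exercise on GPU tensors.
        t = torch.rand((3, 3), requires_grad=True)
        loss = (t * 2).sum()
        dist_autograd.backward(context_id, [loss])
        grads = dist_autograd.get_gradients(context_id)
        assert torch.equal(grads[t], torch.full((3, 3), 2.0))

    rpc.shutdown()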
2022-09-27T15:50:59.9526792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:50:59.9527326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:50:59.9529839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:50:59.9530349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:00.1890614Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9xcg6m4q
2022-09-27T15:51:00.1891915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9xcg6m4q/_remote_module_non_scriptable.py
2022-09-27T15:51:00.6370009Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:51:00.6386044Z 
2022-09-27T15:51:00.6386365Z Running tests...
2022-09-27T15:51:00.6386817Z ----------------------------------------------------------------------
2022-09-27T15:51:02.1350583Z test_gpu_to_cpu_continuation (__main__.TensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:51:02.1535166Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5562
2022-09-27T15:51:02.1541935Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5563
2022-09-27T15:51:02.1548797Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5564
2022-09-27T15:51:02.1556460Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5565
2022-09-27T15:51:03.7483967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:03.7484512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:03.7485917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:03.7486394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:03.7666910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:03.7667360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:03.7670881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:03.7671606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:03.8279775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:03.8280231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:03.8283039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:03.8283510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:03.8543705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:03.8544211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:03.8547031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:03.8547516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:03.9942696Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiuvam8e_
2022-09-27T15:51:03.9943298Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiuvam8e_/_remote_module_non_scriptable.py
2022-09-27T15:51:03.9955293Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpljigxl_1
2022-09-27T15:51:03.9958163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpljigxl_1/_remote_module_non_scriptable.py
2022-09-27T15:51:04.0459986Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv6gn0anc
2022-09-27T15:51:04.0462528Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv6gn0anc/_remote_module_non_scriptable.py
2022-09-27T15:51:04.0816115Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa71nmmzo
2022-09-27T15:51:04.0818942Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa71nmmzo/_remote_module_non_scriptable.py
2022-09-27T15:51:04.4279218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:51:04.4351487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:51:04.4830051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:51:04.5258036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:51:04.5495714Z fi_getinfo: -61
2022-09-27T15:51:04.5566002Z fi_getinfo: -61
2022-09-27T15:51:04.6045947Z fi_getinfo: -61
2022-09-27T15:51:04.6473319Z fi_getinfo: -61
2022-09-27T15:51:07.0668472Z ok (6.428s)
2022-09-27T15:51:07.0668694Z 
2022-09-27T15:51:07.0669109Z ----------------------------------------------------------------------
2022-09-27T15:51:07.0669454Z Ran 1 test in 6.428s
2022-09-27T15:51:07.0669627Z 
2022-09-27T15:51:07.0669723Z OK
2022-09-27T15:51:07.0669858Z 
2022-09-27T15:51:07.0669996Z Generating XML reports...
2022-09-27T15:51:07.0705665Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220927155100.xml
2022-09-27T15:51:09.0348358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:09.0348868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:09.0351118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:09.0351903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:09.2626275Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf4vx9dbg
2022-09-27T15:51:09.2627467Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf4vx9dbg/_remote_module_non_scriptable.py
2022-09-27T15:51:09.6894692Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:51:09.6909377Z 
2022-09-27T15:51:09.6909913Z Running tests...
2022-09-27T15:51:09.6910350Z ----------------------------------------------------------------------
2022-09-27T15:51:11.1405827Z test_gpu_to_cpu_continuation_gpu_root (__main__.TensorPipeCudaDistAutogradTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:51:11.1582926Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5985
2022-09-27T15:51:11.1589394Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5986
2022-09-27T15:51:11.1597248Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 5987
2022-09-27T15:51:11.1603447Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 5988
2022-09-27T15:51:12.7388275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:12.7388781Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:12.7389898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:12.7390376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:12.7701553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:12.7701997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:12.7706028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:12.7706512Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:12.7873536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:12.7873998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:12.7877803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:12.7878269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:12.7919794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:12.7920228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:12.7923701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:12.7924322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:12.9875710Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp75b9qsfi
2022-09-27T15:51:12.9876722Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp75b9qsfi/_remote_module_non_scriptable.py
2022-09-27T15:51:13.0116141Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx__q6350
2022-09-27T15:51:13.0119053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx__q6350/_remote_module_non_scriptable.py
2022-09-27T15:51:13.0148547Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsskbmtn0
2022-09-27T15:51:13.0149075Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp88zy4ro5
2022-09-27T15:51:13.0151324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsskbmtn0/_remote_module_non_scriptable.py
2022-09-27T15:51:13.0151863Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp88zy4ro5/_remote_module_non_scriptable.py
2022-09-27T15:51:13.4410162Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:51:13.4617207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:51:13.4625220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:51:13.4664756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:51:13.5628849Z fi_getinfo: -61
2022-09-27T15:51:13.5834997Z fi_getinfo: -61
2022-09-27T15:51:13.5840860Z fi_getinfo: -61
2022-09-27T15:51:13.5882291Z fi_getinfo: -61
2022-09-27T15:51:16.0714353Z ok (6.380s)
2022-09-27T15:51:16.0714572Z 
2022-09-27T15:51:16.0714977Z ----------------------------------------------------------------------
2022-09-27T15:51:16.0715315Z Ran 1 test in 6.380s
2022-09-27T15:51:16.0715481Z 
2022-09-27T15:51:16.0715576Z OK
2022-09-27T15:51:16.0718205Z 
2022-09-27T15:51:16.0718675Z Generating XML reports...
2022-09-27T15:51:16.0751562Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220927155109.xml
2022-09-27T15:51:18.0418114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:18.0418628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:18.0421318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:18.0421794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:18.2724536Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe52fyzpc
2022-09-27T15:51:18.2725999Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe52fyzpc/_remote_module_non_scriptable.py
2022-09-27T15:51:18.7000428Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:51:18.7014702Z 
2022-09-27T15:51:18.7015072Z Running tests...
2022-09-27T15:51:18.7015875Z ----------------------------------------------------------------------
2022-09-27T15:51:20.1611908Z test_input_moved_to_cuda_device (__main__.TensorPipeCudaRemoteModuleTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:51:20.1788084Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6408
2022-09-27T15:51:20.1794847Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6409
2022-09-27T15:51:21.8173141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:21.8173624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:21.8174561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:21.8175340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:21.8248910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:21.8249361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:21.8252631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:21.8253109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:22.0446471Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_ogd1oxs
2022-09-27T15:51:22.0447283Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_ogd1oxs/_remote_module_non_scriptable.py
2022-09-27T15:51:22.0558547Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcbetoxqg
2022-09-27T15:51:22.0561244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcbetoxqg/_remote_module_non_scriptable.py
2022-09-27T15:51:22.4860818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:51:22.4868058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:51:22.6080551Z fi_getinfo: -61
2022-09-27T15:51:22.6083985Z fi_getinfo: -61
2022-09-27T15:51:24.4883260Z ok (5.787s)
2022-09-27T15:51:24.4883514Z 
2022-09-27T15:51:24.4884131Z ----------------------------------------------------------------------
2022-09-27T15:51:24.4884531Z Ran 1 test in 5.787s
2022-09-27T15:51:24.4884697Z 
2022-09-27T15:51:24.4884791Z OK
2022-09-27T15:51:24.4884908Z 
2022-09-27T15:51:24.4885045Z Generating XML reports...
2022-09-27T15:51:24.4919224Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155118.xml
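test_input_moved_to_cuda_device, which just passed, checks that RemoteModule forwards CPU arguments onto the remote module's device, and the _remote_module_non_scriptable.py files in the log are the glue code RemoteModule generates in each process. A minimal single-worker sketch of the API under test, on CPU and with illustrative rendezvous values (the test proper uses a remote_device like "worker1/cuda:0" across two processes):

    import os
    import torch
    import torch.distributed.rpc as rpc
    from torch import nn
    from torch.distributed.nn import RemoteModule

    os.environ.setdefault("MASTER_ADDR", "localhost")  # illustrative rendezvous
    os.environ.setdefault("MASTER_PORT", "29501")
    rpc.init_rpc("worker0", rank=0, world_size=1)

    # Instantiates nn.Linear(2, 3) on the target worker and device; with a
    # "worker/cuda:N" remote_device, CPU inputs are moved to that GPU
    # before the remote forward runs.
    linear = RemoteModule("worker0/cpu", nn.Linear, args=(2, 3))
    out = linear.forward(torch.randn(4, 2))
    print(out.shape)  # torch.Size([4, 3])

    rpc.shutdown()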
2022-09-27T15:51:26.4672327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:26.4672839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:26.4676338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:26.4676826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:26.6957669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfgl4eo0w
2022-09-27T15:51:26.6958866Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfgl4eo0w/_remote_module_non_scriptable.py
2022-09-27T15:51:27.1231180Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:51:27.1246487Z 
2022-09-27T15:51:27.1246746Z Running tests...
2022-09-27T15:51:27.1247177Z ----------------------------------------------------------------------
2022-09-27T15:51:28.5691746Z test_input_moved_to_cuda_device_script (__main__.TensorPipeCudaRemoteModuleTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:51:28.5867813Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6598
2022-09-27T15:51:28.5874636Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6599
2022-09-27T15:51:30.1737377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:30.1737887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:30.1739689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:30.1740157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:30.2022095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:30.2022566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:30.2026607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:30.2027083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:30.4121956Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpma4osgqy
2022-09-27T15:51:30.4122548Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpma4osgqy/_remote_module_non_scriptable.py
2022-09-27T15:51:30.4287657Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfwthn257
2022-09-27T15:51:30.4290876Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfwthn257/_remote_module_non_scriptable.py
2022-09-27T15:51:30.8415820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:51:30.8706596Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:51:30.9631161Z fi_getinfo: -61
2022-09-27T15:51:30.9921653Z fi_getinfo: -61
2022-09-27T15:51:31.1345245Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpma4osgqy/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py
2022-09-27T15:51:31.1346401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfwthn257/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py
2022-09-27T15:51:31.1493720Z INFO:torch.distributed.nn.jit.instantiator:Skipped writing /tmp/tmpfwthn257/_remote_module___torch___torch_testing__internal_distributed_nn_api_remote_module_test_MyModuleInterface.py
2022-09-27T15:51:32.9966286Z ok (5.872s)
2022-09-27T15:51:32.9966507Z 
2022-09-27T15:51:32.9966898Z ----------------------------------------------------------------------
2022-09-27T15:51:32.9967239Z Ran 1 test in 5.872s
2022-09-27T15:51:32.9967415Z 
2022-09-27T15:51:32.9967494Z OK
2022-09-27T15:51:32.9967631Z 
2022-09-27T15:51:32.9967785Z Generating XML reports...
2022-09-27T15:51:33.0003746Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155127.xml
2022-09-27T15:51:34.9847458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:34.9848271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:34.9849678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:34.9850163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:35.2185244Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbhgnm_sl
2022-09-27T15:51:35.2186290Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbhgnm_sl/_remote_module_non_scriptable.py
2022-09-27T15:51:35.6600004Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:51:35.6615272Z 
2022-09-27T15:51:35.6615817Z Running tests...
2022-09-27T15:51:35.6616306Z ----------------------------------------------------------------------
2022-09-27T15:51:37.1459081Z test_invalid_devices (__main__.TensorPipeCudaRemoteModuleTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:51:37.1642658Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6804
2022-09-27T15:51:37.1649032Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6805
2022-09-27T15:51:38.7438297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:38.7438908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:38.7440388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:38.7440998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:38.7733351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:38.7733818Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:38.7737185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:38.7737673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:38.9871199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxmzul04n
2022-09-27T15:51:38.9872240Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxmzul04n/_remote_module_non_scriptable.py
2022-09-27T15:51:39.0041827Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpis_4i2ss
2022-09-27T15:51:39.0044704Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpis_4i2ss/_remote_module_non_scriptable.py
2022-09-27T15:51:39.4313296Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:51:39.4504884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:51:39.5532433Z fi_getinfo: -61
2022-09-27T15:51:39.5720765Z fi_getinfo: -61
2022-09-27T15:51:39.7527326Z On WorkerInfo(id=1, name=worker1):
2022-09-27T15:51:39.7549228Z RuntimeError('CUDA error: invalid device ordinal\nCUDA kernel errors might be
asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fb15d9f550b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: + 0x14844 (0x7fb166f6a844 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #2: + 0x10a43e8 (0x7fb15ecdb3e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #3: + 0x29d2bc5 (0x7fb160609bc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #4: + 0x29d2d6b (0x7fb160609d6b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7fb168dc2ee3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x1f164b5 (0x7fb1690c34b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7fb168e00ea8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x11b3e7f (0x7fb168360e7f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12e6 (0x7fb1686cad56 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x20e3ed3 (0x7fb169290ed3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x1f16178 (0x7fb1690c3178 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x3207211 (0x7fb16a3b4211 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: + 0x32077bb (0x7fb16a3b47bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fb168b6a801 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fb1686c329e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: + 0x229ce89 (0x7fb169449e89 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fb168cc99f5 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x34a02f (0x7fb17523602f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #21: + 0x34a4db (0x7fb1752364db in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #22: + 0x1ddc68 (0x55819db53c68 in /opt/conda/bin/python)\nframe #23: + 0x1049f3 (0x55819da7a9f3 in /opt/conda/bin/python)\nframe #24: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #25: + 0x104425 (0x55819da7a425 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #27: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)\nframe #28: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)\nframe #29: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #30: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)\nframe #31: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)\nframe #32: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #33: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)\nframe #34: _PyEval_EvalFrameDefault + 0x26e4 (0x55819db59774 in /opt/conda/bin/python)\nframe #35: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #36: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)\nframe #37: + 0xa53d8a (0x7fb17593fd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb17593dfcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fb1759412a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fb175941973 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fb16b967c04 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fb175941095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: + 0x47b3f43 (0x7fb16b960f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fb16b961ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fb16b95bfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: + 0x47e3a02 (0x7fb16b990a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb15d9e393b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #48: + 0xc9039 (0x7fb18d007039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #49: + 0x76db 
(0x7fb1ad5b76db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #50: clone + 0x3f (0x7fb1ad2e061f in /lib/x86_64-linux-gnu/libc.so.6)\n')
2022-09-27T15:51:39.7559446Z Traceback (most recent call last):
2022-09-27T15:51:39.7560001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function
2022-09-27T15:51:39.7560456Z result = python_udf.func(*python_udf.args, **python_udf.kwargs)
2022-09-27T15:51:39.7561040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 91, in _create_module
2022-09-27T15:51:39.7561413Z module.to(device)
2022-09-27T15:51:39.7561979Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in to
2022-09-27T15:51:39.7562328Z return self._apply(convert)
2022-09-27T15:51:39.7562805Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 662, in _apply
2022-09-27T15:51:39.7563176Z param_applied = fn(param)
2022-09-27T15:51:39.7563633Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 985, in convert
2022-09-27T15:51:39.7564093Z return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
2022-09-27T15:51:39.7564484Z RuntimeError: CUDA error: invalid device ordinal
2022-09-27T15:51:39.7579644Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2022-09-27T15:51:39.7580169Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2022-09-27T15:51:39.7580658Z Exception raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):
2022-09-27T15:51:39.7581599Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fb15d9f550b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
2022-09-27T15:51:39.7582373Z frame #1: + 0x14844 (0x7fb166f6a844 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
2022-09-27T15:51:39.7583048Z frame #2: + 0x10a43e8 (0x7fb15ecdb3e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)
2022-09-27T15:51:39.7583840Z frame #3: + 0x29d2bc5 (0x7fb160609bc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)
2022-09-27T15:51:39.7584515Z frame #4: + 0x29d2d6b (0x7fb160609d6b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)
2022-09-27T15:51:39.7585592Z frame #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7fb168dc2ee3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7586493Z frame #6: + 0x1f164b5 (0x7fb1690c34b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7587496Z frame #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7fb168e00ea8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7588354Z frame #8: + 0x11b3e7f (0x7fb168360e7f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7589339Z frame #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12e6 (0x7fb1686cad56 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7590195Z frame #10: + 0x20e3ed3 (0x7fb169290ed3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7591553Z frame #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7592465Z frame #12: + 0x1f16178 (0x7fb1690c3178 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7593527Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7594419Z frame #14: + 0x3207211 (0x7fb16a3b4211 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7595078Z frame #15: + 0x32077bb (0x7fb16a3b47bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7596161Z frame #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fb168b6a801 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7597360Z frame #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fb1686c329e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7598209Z frame #18: + 0x229ce89 (0x7fb169449e89 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7599240Z frame #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fb168cc99f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7600182Z frame #20: + 0x34a02f (0x7fb17523602f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7600878Z frame #21: + 0x34a4db (0x7fb1752364db in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7601367Z frame #22: + 0x1ddc68 (0x55819db53c68 in /opt/conda/bin/python)
2022-09-27T15:51:39.7601787Z frame #23: + 0x1049f3 (0x55819da7a9f3 in /opt/conda/bin/python)
2022-09-27T15:51:39.7602184Z frame #24: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)
2022-09-27T15:51:39.7602594Z frame #25: + 0x104425 (0x55819da7a425 in /opt/conda/bin/python)
2022-09-27T15:51:39.7603006Z frame #26: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)
2022-09-27T15:51:39.7603405Z frame #27: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)
2022-09-27T15:51:39.7603821Z frame #28: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)
2022-09-27T15:51:39.7604230Z frame #29: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)
2022-09-27T15:51:39.7604647Z frame #30: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)
2022-09-27T15:51:39.7605048Z frame #31: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)
2022-09-27T15:51:39.7605459Z frame #32: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)
2022-09-27T15:51:39.7605872Z frame #33: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)
2022-09-27T15:51:39.7606305Z frame #34: _PyEval_EvalFrameDefault + 0x26e4 (0x55819db59774 in /opt/conda/bin/python)
2022-09-27T15:51:39.7606715Z frame #35: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)
2022-09-27T15:51:39.7607123Z frame #36: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)
2022-09-27T15:51:39.7607758Z frame #37: + 0xa53d8a (0x7fb17593fd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7608579Z frame #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb17593dfcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7609641Z frame #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fb1759412a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7610831Z frame #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fb175941973 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7612168Z frame #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fb16b967c04 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7613530Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fb175941095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
2022-09-27T15:51:39.7614467Z frame #43: + 0x47b3f43 (0x7fb16b960f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7615435Z frame #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fb16b961ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7616634Z frame #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fb16b95bfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7617476Z frame #46: + 0x47e3a02 (0x7fb16b990a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
2022-09-27T15:51:39.7618191Z frame #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb15d9e393b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
2022-09-27T15:51:39.7618714Z frame #48: + 0xc9039 (0x7fb18d007039 in /opt/conda/bin/../lib/libstdc++.so.6)
2022-09-27T15:51:39.7619270Z frame #49: + 0x76db (0x7fb1ad5b76db in /lib/x86_64-linux-gnu/libpthread.so.0)
2022-09-27T15:51:39.7619796Z frame #50: clone + 0x3f (0x7fb1ad2e061f in /lib/x86_64-linux-gnu/libc.so.6)
2022-09-27T15:51:39.7620040Z 
2022-09-27T15:51:39.7620059Z 
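The traceback above bottoms out in module.to(device) inside RemoteModule's _create_module, with a device ordinal the remote worker does not have; that is exactly the failure mode test_invalid_devices provokes on purpose. A minimal sketch reproducing the same error class on a machine with a CUDA build of torch (the ordinal 99 is illustrative, standing in for any index beyond torch.cuda.device_count()):

    import torch
    from torch import nn

    module = nn.Linear(2, 2)
    try:
        # Same call path the remote worker takes in _create_module.
        module.to("cuda:99")
    except RuntimeError as err:
        print(err)  # "CUDA error: invalid device ordinal" on a CUDA build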
2022-09-27T15:51:39.7620199Z On WorkerInfo(id=1, name=worker1):
2022-09-27T15:51:39.7656086Z RuntimeError('On WorkerInfo(id=1, name=worker1):\nRuntimeError(\'CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):\nframe #0:
c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fb15d9f550b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: + 0x14844 (0x7fb166f6a844 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #2: + 0x10a43e8 (0x7fb15ecdb3e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #3: + 0x29d2bc5 (0x7fb160609bc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #4: + 0x29d2d6b (0x7fb160609d6b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7fb168dc2ee3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x1f164b5 (0x7fb1690c34b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7fb168e00ea8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x11b3e7f (0x7fb168360e7f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12e6 (0x7fb1686cad56 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x20e3ed3 (0x7fb169290ed3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x1f16178 (0x7fb1690c3178 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x3207211 (0x7fb16a3b4211 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: + 0x32077bb (0x7fb16a3b47bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fb168b6a801 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fb1686c329e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: + 0x229ce89 (0x7fb169449e89 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fb168cc99f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x34a02f (0x7fb17523602f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #21: + 0x34a4db (0x7fb1752364db in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #22: + 0x1ddc68
(0x55819db53c68 in /opt/conda/bin/python)\nframe #23: + 0x1049f3 (0x55819da7a9f3 in /opt/conda/bin/python)\nframe #24: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #25: + 0x104425 (0x55819da7a425 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #27: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)\nframe #28: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)\nframe #29: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #30: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)\nframe #31: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)\nframe #32: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #33: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)\nframe #34: _PyEval_EvalFrameDefault + 0x26e4 (0x55819db59774 in /opt/conda/bin/python)\nframe #35: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #36: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)\nframe #37: + 0xa53d8a (0x7fb17593fd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb17593dfcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fb1759412a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fb175941973 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fb16b967c04 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fb175941095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: + 0x47b3f43 (0x7fb16b960f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fb16b961ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fb16b95bfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: + 0x47e3a02 (0x7fb16b990a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb15d9e393b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #48: + 0xc9039 (0x7fb18d007039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #49: + 0x76db (0x7fb1ad5b76db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #50: clone + 0x3f (0x7fb1ad2e061f in /lib/x86_64-linux-gnu/libc.so.6)\n\')\nTraceback (most recent call last):\n File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function\n result =
python_udf.func(*python_udf.args, **python_udf.kwargs)\n File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 91, in _create_module\n module.to(device)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in to\n return self._apply(convert)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 662, in _apply\n param_applied = fn(param)\n File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 985, in convert\n return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)\nRuntimeError: CUDA error: invalid device ordinal\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nException raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fb15d9f550b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: + 0x14844 (0x7fb166f6a844 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)\nframe #2: + 0x10a43e8 (0x7fb15ecdb3e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #3: + 0x29d2bc5 (0x7fb160609bc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #4: + 0x29d2d6b (0x7fb160609d6b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7fb168dc2ee3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x1f164b5 (0x7fb1690c34b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7fb168e00ea8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #8: + 0x11b3e7f (0x7fb168360e7f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12e6 (0x7fb1686cad56 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x20e3ed3 (0x7fb169290ed3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x1f16178 (0x7fb1690c3178 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #14: + 0x3207211 (0x7fb16a3b4211 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #15: + 0x32077bb (0x7fb16a3b47bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #16: at::_ops::_to_copy::call(at::Tensor const&,
c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fb168b6a801 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fb1686c329e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #18: + 0x229ce89 (0x7fb169449e89 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fb168cc99f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #20: + 0x34a02f (0x7fb17523602f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #21: + 0x34a4db (0x7fb1752364db in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #22: + 0x1ddc68 (0x55819db53c68 in /opt/conda/bin/python)\nframe #23: + 0x1049f3 (0x55819da7a9f3 in /opt/conda/bin/python)\nframe #24: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #25: + 0x104425 (0x55819da7a425 in /opt/conda/bin/python)\nframe #26: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #27: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)\nframe #28: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)\nframe #29: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #30: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python)\nframe #31: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python)\nframe #32: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #33: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)\nframe #34: _PyEval_EvalFrameDefault + 0x26e4 (0x55819db59774 in /opt/conda/bin/python)\nframe #35: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python)\nframe #36: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python)\nframe #37: + 0xa53d8a (0x7fb17593fd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb17593dfcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fb1759412a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fb175941973 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fb16b967c04 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fb175941095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #43: + 0x47b3f43 (0x7fb16b960f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, 
std::vector >) const + 0x538 (0x7fb16b961ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fb16b95bfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #46: + 0x47e3a02 (0x7fb16b990a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb15d9e393b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #48: + 0xc9039 (0x7fb18d007039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #49: + 0x76db (0x7fb1ad5b76db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #50: clone + 0x3f (0x7fb1ad2e061f in /lib/x86_64-linux-gnu/libc.so.6)\n\n') 2022-09-27T15:51:39.7677124Z Traceback (most recent call last): 2022-09-27T15:51:39.7677665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-09-27T15:51:39.7678201Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-09-27T15:51:39.7678634Z File "/tmp/tmpbhgnm_sl/_remote_module_non_scriptable.py", line 47, in _remote_forward 2022-09-27T15:51:39.7678980Z module = module_rref.local_value() 2022-09-27T15:51:39.7679504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 220, in _handle_exception 2022-09-27T15:51:39.7680076Z raise result.exception_type(result.msg.encode("utf-8").decode("unicode_escape")) 2022-09-27T15:51:39.7680455Z RuntimeError: On WorkerInfo(id=1, name=worker1): 2022-09-27T15:51:39.7680858Z RuntimeError('CUDA error: invalid device ordinal 2022-09-27T15:51:39.7681304Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-09-27T15:51:39.7681828Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 
2022-09-27T15:51:39.7682288Z Exception raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first): 2022-09-27T15:51:39.7683135Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fb15d9f550b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T15:51:39.7683869Z frame #1: + 0x14844 (0x7fb166f6a844 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so) 2022-09-27T15:51:39.7684516Z frame #2: + 0x10a43e8 (0x7fb15ecdb3e8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T15:51:39.7685146Z frame #3: + 0x29d2bc5 (0x7fb160609bc5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T15:51:39.7685782Z frame #4: + 0x29d2d6b (0x7fb160609d6b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T15:51:39.7686808Z frame #5: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x1e3 (0x7fb168dc2ee3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7687649Z frame #6: + 0x1f164b5 (0x7fb1690c34b5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7688591Z frame #7: at::_ops::empty_strided::call(c10::ArrayRef, c10::ArrayRef, c10::optional, c10::optional, c10::optional, c10::optional) + 0x168 (0x7fb168e00ea8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7689369Z frame #8: + 0x11b3e7f (0x7fb168360e7f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7690318Z frame #9: at::native::_to_copy(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x12e6 (0x7fb1686cad56 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7691117Z frame #10: + 0x20e3ed3 (0x7fb169290ed3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7692160Z frame #11: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7692994Z frame #12: + 0x1f16178 (0x7fb1690c3178 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7694034Z frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x103 (0x7fb168b13ec3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7694881Z frame #14: + 0x3207211 (0x7fb16a3b4211 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7695512Z frame #15: + 0x32077bb (0x7fb16a3b47bb in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7696460Z frame #16: at::_ops::_to_copy::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, c10::optional) + 0x201 (0x7fb168b6a801 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7697614Z frame #17: at::native::to(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x13e (0x7fb1686c329e 
in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7698406Z frame #18: + 0x229ce89 (0x7fb169449e89 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7699352Z frame #19: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional, c10::optional, c10::optional, c10::optional, bool, bool, c10::optional) + 0x215 (0x7fb168cc99f5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7700169Z frame #20: + 0x34a02f (0x7fb17523602f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7700810Z frame #21: + 0x34a4db (0x7fb1752364db in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7701271Z frame #22: + 0x1ddc68 (0x55819db53c68 in /opt/conda/bin/python) 2022-09-27T15:51:39.7701656Z frame #23: + 0x1049f3 (0x55819da7a9f3 in /opt/conda/bin/python) 2022-09-27T15:51:39.7702040Z frame #24: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python) 2022-09-27T15:51:39.7702425Z frame #25: + 0x104425 (0x55819da7a425 in /opt/conda/bin/python) 2022-09-27T15:51:39.7702792Z frame #26: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python) 2022-09-27T15:51:39.7703175Z frame #27: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python) 2022-09-27T15:51:39.7703561Z frame #28: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python) 2022-09-27T15:51:39.7703942Z frame #29: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python) 2022-09-27T15:51:39.7704308Z frame #30: + 0x18fc9b (0x55819db05c9b in /opt/conda/bin/python) 2022-09-27T15:51:39.7704689Z frame #31: + 0x1052a5 (0x55819da7b2a5 in /opt/conda/bin/python) 2022-09-27T15:51:39.7705078Z frame #32: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python) 2022-09-27T15:51:39.7705442Z frame #33: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python) 2022-09-27T15:51:39.7705854Z frame #34: _PyEval_EvalFrameDefault + 0x26e4 (0x55819db59774 in /opt/conda/bin/python) 2022-09-27T15:51:39.7706260Z frame #35: + 0x18f742 (0x55819db05742 in /opt/conda/bin/python) 2022-09-27T15:51:39.7706638Z frame #36: _PyObject_Call + 0x20a (0x55819dabdfaa in /opt/conda/bin/python) 2022-09-27T15:51:39.7707214Z frame #37: + 0xa53d8a (0x7fb17593fd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7707998Z frame #38: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fb17593dfcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7709050Z frame #39: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fb1759412a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7710181Z frame #40: torch::distributed::rpc::RequestCallbackImpl::processPythonRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x83 (0x7fb175941973 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7711756Z frame #41: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x194 (0x7fb16b967c04 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7713133Z frame #42: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, 
torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fb175941095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T15:51:39.7713991Z frame #43: + 0x47b3f43 (0x7fb16b960f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7714910Z frame #44: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fb16b961ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7715965Z frame #45: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fb16b95bfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7716749Z frame #46: + 0x47e3a02 (0x7fb16b990a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T15:51:39.7717421Z frame #47: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fb15d9e393b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T15:51:39.7717904Z frame #48: + 0xc9039 (0x7fb18d007039 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-09-27T15:51:39.7718444Z frame #49: + 0x76db (0x7fb1ad5b76db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-09-27T15:51:39.7718943Z frame #50: clone + 0x3f (0x7fb1ad2e061f in /lib/x86_64-linux-gnu/libc.so.6) 2022-09-27T15:51:39.7719243Z ') 2022-09-27T15:51:39.7719491Z Traceback (most recent call last): 2022-09-27T15:51:39.7720006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-09-27T15:51:39.7720461Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-09-27T15:51:39.7721020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py", line 91, in _create_module 2022-09-27T15:51:39.7721401Z module.to(device) 2022-09-27T15:51:39.7721856Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in to 2022-09-27T15:51:39.7722202Z return self._apply(convert) 2022-09-27T15:51:39.7722671Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 662, in _apply 2022-09-27T15:51:39.7723041Z param_applied = fn(param) 2022-09-27T15:51:39.7723495Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 985, in convert 2022-09-27T15:51:39.7723957Z return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) 2022-09-27T15:51:39.7724350Z RuntimeError: CUDA error: invalid device ordinal 2022-09-27T15:51:39.7724797Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. 2022-09-27T15:51:39.7725229Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1. 
2022-09-27T15:51:39.7725764Z Exception raised from exchangeDevice at /var/lib/jenkins/workspace/c10/cuda/impl/CUDAGuardImpl.h:34 (most recent call first): [elided: C++ stack trace frames #0-#50, byte-identical to the trace printed above]
2022-09-27T15:51:39.7762717Z
2022-09-27T15:51:39.7762755Z
2022-09-27T15:51:39.7762774Z
2022-09-27T15:51:40.1716354Z ok (4.510s)
2022-09-27T15:51:40.1716747Z
2022-09-27T15:51:40.1717202Z ----------------------------------------------------------------------
2022-09-27T15:51:40.1717525Z Ran 1 test in 4.510s
2022-09-27T15:51:40.1717689Z
2022-09-27T15:51:40.1717792Z OK
2022-09-27T15:51:40.1717926Z
2022-09-27T15:51:40.1718058Z Generating XML reports...
2022-09-27T15:51:40.1752982Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155135.xml
2022-09-27T15:51:42.1698363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:51:42.1699363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:51:42.1700516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:51:42.1701485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:51:42.4075987Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpad8wrw1g
2022-09-27T15:51:42.4077166Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpad8wrw1g/_remote_module_non_scriptable.py
2022-09-27T15:51:42.8497566Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:51:42.8514669Z
2022-09-27T15:51:42.8515072Z Running tests...
2022-09-27T15:51:42.8515585Z ----------------------------------------------------------------------
2022-09-27T15:51:44.3453104Z test_valid_device (__main__.TensorPipeCudaRemoteModuleTest) ...
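Aside: the RuntimeError recorded above is raised inside torch/distributed/nn/api/remote_module.py when _create_module calls module.to(device) with a CUDA ordinal that worker1 does not have; the surrounding test still reports "ok" because it expects exactly this failure. A minimal standalone sketch of the same failure mode, not taken from the suite, assuming the host has fewer than 100 CUDA devices so ordinal 99 is invalid:

    # Sketch only: reproduce "CUDA error: invalid device ordinal".
    # Assumes a CUDA build of PyTorch and fewer than 100 visible devices.
    import torch

    if torch.cuda.is_available():
        try:
            torch.zeros(1).to(torch.device("cuda", 99))  # nonexistent ordinal
        except RuntimeError as exc:
            print(exc)  # e.g. "CUDA error: invalid device ordinal ..."

The exact message text can vary across PyTorch versions, but the error class and the ordinal check are stable.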
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:51:44.3636522Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6993 2022-09-27T15:51:44.3643085Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6994 2022-09-27T15:51:45.9681623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:45.9682161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:45.9683415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:45.9683908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:46.0163465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:46.0163938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:46.0167156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:46.0167630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:46.1997460Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvnrwl54m 2022-09-27T15:51:46.1998415Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvnrwl54m/_remote_module_non_scriptable.py 2022-09-27T15:51:46.2388955Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp980i6wn 2022-09-27T15:51:46.2391451Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp980i6wn/_remote_module_non_scriptable.py 2022-09-27T15:51:46.6273378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:51:46.6741678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:51:46.7491189Z fi_getinfo: -61 2022-09-27T15:51:46.7958334Z fi_getinfo: -61 2022-09-27T15:51:48.6732678Z ok (5.821s) 2022-09-27T15:51:48.6733008Z 2022-09-27T15:51:48.6733594Z ---------------------------------------------------------------------- 2022-09-27T15:51:48.6733942Z Ran 1 test in 5.822s 2022-09-27T15:51:48.6734108Z 2022-09-27T15:51:48.6734202Z OK 2022-09-27T15:51:48.6734328Z 2022-09-27T15:51:48.6734464Z Generating XML reports... 2022-09-27T15:51:48.6768470Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155142.xml 2022-09-27T15:51:50.6562034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:50.6562560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:50.6563868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:50.6564359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:50.8882138Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdh8veuum 2022-09-27T15:51:50.8883535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdh8veuum/_remote_module_non_scriptable.py 2022-09-27T15:51:51.3178666Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:51:51.3193574Z 2022-09-27T15:51:51.3193874Z Running tests... 
2022-09-27T15:51:51.3194313Z ---------------------------------------------------------------------- 2022-09-27T15:51:52.7866786Z test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:51:52.8045142Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7183 2022-09-27T15:51:52.8051223Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7184 2022-09-27T15:51:52.8057354Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 7185 2022-09-27T15:51:52.8063881Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 7186 2022-09-27T15:51:54.3909613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:54.3910611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:54.3912156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:54.3913112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:54.4388903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:54.4389855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:54.4393829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:54.4394781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:54.4402436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:54.4403392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:54.4406685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:54.4407902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:54.4571368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:51:54.4572280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:51:54.4575815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:51:54.4576786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:51:54.6299703Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmperh6_5ae 2022-09-27T15:51:54.6300889Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmperh6_5ae/_remote_module_non_scriptable.py 2022-09-27T15:51:54.6766494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp62o654ac 2022-09-27T15:51:54.6768367Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp62o654ac/_remote_module_non_scriptable.py 2022-09-27T15:51:54.6842178Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx7_q2frb 2022-09-27T15:51:54.6844789Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx7_q2frb/_remote_module_non_scriptable.py 2022-09-27T15:51:54.6926273Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt00afrjw 2022-09-27T15:51:54.6928671Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt00afrjw/_remote_module_non_scriptable.py 2022-09-27T15:51:55.0586587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:51:55.1161680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:51:55.1376518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:51:55.1441895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:51:55.1802445Z fi_getinfo: -61 2022-09-27T15:51:55.2377701Z fi_getinfo: -61 2022-09-27T15:51:55.2591789Z fi_getinfo: -61 2022-09-27T15:51:55.2657344Z fi_getinfo: -61 2022-09-27T15:51:59.8242384Z ok (8.504s) 2022-09-27T15:51:59.8242616Z 2022-09-27T15:51:59.8243468Z ---------------------------------------------------------------------- 2022-09-27T15:51:59.8243847Z Ran 1 test in 8.505s 2022-09-27T15:51:59.8244015Z 2022-09-27T15:51:59.8244103Z OK 2022-09-27T15:51:59.8244238Z 2022-09-27T15:51:59.8244377Z Generating XML reports... 2022-09-27T15:51:59.8279460Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRpcTest-20220927155151.xml 2022-09-27T15:52:01.7926409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:01.7929202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:01.7929845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:01.7930615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:02.0576534Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyuhza9pq 2022-09-27T15:52:02.0577333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyuhza9pq/_remote_module_non_scriptable.py 2022-09-27T15:52:02.4857866Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:02.4872256Z 2022-09-27T15:52:02.4872712Z Running tests... 2022-09-27T15:52:02.4873210Z ---------------------------------------------------------------------- 2022-09-27T15:52:03.9726899Z test_basic_gloo_ckpt_always (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:03.9902790Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7530 2022-09-27T15:52:03.9909483Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7531 2022-09-27T15:52:05.6119330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:05.6119839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:05.6121267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:05.6121746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:05.6181907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:05.6182366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:05.6185718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:05.6186216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:05.8435642Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkt0a5l7p 2022-09-27T15:52:05.8436614Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkt0a5l7p/_remote_module_non_scriptable.py 2022-09-27T15:52:05.8495626Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxyq4agu7 2022-09-27T15:52:05.8498219Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxyq4agu7/_remote_module_non_scriptable.py 2022-09-27T15:52:06.2932053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:06.2935179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:06.6967925Z skip: Need at least 4 CUDA devices (4.209s) 2022-09-27T15:52:06.6968170Z 2022-09-27T15:52:06.6968552Z ---------------------------------------------------------------------- 2022-09-27T15:52:06.6968915Z Ran 1 test in 4.209s 2022-09-27T15:52:06.6969079Z 2022-09-27T15:52:06.6969188Z OK (skipped=1) 2022-09-27T15:52:06.6969342Z 2022-09-27T15:52:06.6969476Z Generating XML reports... 2022-09-27T15:52:06.7005045Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155202.xml 2022-09-27T15:52:08.6775362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:08.6775861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:08.6777574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:08.6778053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:08.9079973Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2b_orrjj 2022-09-27T15:52:08.9080864Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2b_orrjj/_remote_module_non_scriptable.py 2022-09-27T15:52:09.3362838Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:09.3377373Z 2022-09-27T15:52:09.3377991Z Running tests... 
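Aside: the "skip: Need at least 4 CUDA devices" results in this stretch of the log come from a GPU-count guard; the runner for this shard exposes fewer than four devices, so each TensorPipePipeWithDDPTest case is skipped rather than run. A rough sketch of such a guard using only public APIs (the suite's own decorator, skip_if_lt_x_gpu in torch.testing._internal.common_distributed, plays this role):

    # Sketch of a "need >= N GPUs" skip guard with the same message as the log.
    import unittest
    import torch

    def skip_if_lt_x_gpu(x):
        # Skip the decorated test when fewer than x CUDA devices are visible.
        return unittest.skipIf(
            torch.cuda.device_count() < x,
            f"Need at least {x} CUDA devices",
        )

    class ExampleTest(unittest.TestCase):
        @skip_if_lt_x_gpu(4)
        def test_needs_four_gpus(self):
            self.assertGreaterEqual(torch.cuda.device_count(), 4)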
2022-09-27T15:52:09.3378507Z ---------------------------------------------------------------------- 2022-09-27T15:52:10.8415321Z test_basic_gloo_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:10.8599276Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7633 2022-09-27T15:52:10.8606226Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7634 2022-09-27T15:52:12.4981588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:12.4982087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:12.4983586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:12.4984085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:12.5007497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:12.5007958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:12.5011621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:12.5012106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:12.7285709Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqbv73np2 2022-09-27T15:52:12.7286584Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqbv73np2/_remote_module_non_scriptable.py 2022-09-27T15:52:12.7431239Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg74uzx6z 2022-09-27T15:52:12.7434475Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg74uzx6z/_remote_module_non_scriptable.py 2022-09-27T15:52:13.1839455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:13.1913966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:13.5666533Z skip: Need at least 4 CUDA devices (4.229s) 2022-09-27T15:52:13.5666779Z 2022-09-27T15:52:13.5667182Z ---------------------------------------------------------------------- 2022-09-27T15:52:13.5667528Z Ran 1 test in 4.229s 2022-09-27T15:52:13.5667692Z 2022-09-27T15:52:13.5667785Z OK (skipped=1) 2022-09-27T15:52:13.5667948Z 2022-09-27T15:52:13.5668076Z Generating XML reports... 
2022-09-27T15:52:13.5707149Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155209.xml 2022-09-27T15:52:15.5383141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:15.5383635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:15.5385779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:15.5386268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:15.7671649Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1fla8bn3 2022-09-27T15:52:15.7672259Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1fla8bn3/_remote_module_non_scriptable.py 2022-09-27T15:52:16.1953593Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:16.1968444Z 2022-09-27T15:52:16.1968590Z Running tests... 2022-09-27T15:52:16.1969530Z ---------------------------------------------------------------------- 2022-09-27T15:52:17.6413033Z test_basic_gloo_ckpt_never (__main__.TensorPipePipeWithDDPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:17.6588649Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7736 2022-09-27T15:52:17.6595711Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7737 2022-09-27T15:52:19.2253510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:19.2254131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:19.2255929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:19.2256405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:19.2335199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:19.2335949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:19.2338657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:19.2339135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:19.4528096Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwkvusvca 2022-09-27T15:52:19.4529024Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwkvusvca/_remote_module_non_scriptable.py 2022-09-27T15:52:19.4610330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq6acghkk 2022-09-27T15:52:19.4613132Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq6acghkk/_remote_module_non_scriptable.py 2022-09-27T15:52:19.8878674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:19.8935376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:20.2651628Z skip: Need at least 4 CUDA devices (4.068s) 2022-09-27T15:52:20.2651862Z 2022-09-27T15:52:20.2652263Z ---------------------------------------------------------------------- 2022-09-27T15:52:20.2652599Z Ran 1 test in 4.068s 
2022-09-27T15:52:20.2652764Z 2022-09-27T15:52:20.2652876Z OK (skipped=1) 2022-09-27T15:52:20.2653031Z 2022-09-27T15:52:20.2653155Z Generating XML reports... 2022-09-27T15:52:20.2688701Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155216.xml 2022-09-27T15:52:22.2109988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:22.2111058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:22.2112287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:22.2112793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:22.4423515Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy77wnzb4 2022-09-27T15:52:22.4425155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy77wnzb4/_remote_module_non_scriptable.py 2022-09-27T15:52:22.8716855Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:22.8731065Z 2022-09-27T15:52:22.8731486Z Running tests... 2022-09-27T15:52:22.8732423Z ---------------------------------------------------------------------- 2022-09-27T15:52:24.3187482Z test_basic_gloo_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:24.3366557Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7839 2022-09-27T15:52:24.3372554Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7840 2022-09-27T15:52:25.9102557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:25.9103635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:25.9104273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:25.9104761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:25.9655966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:25.9656446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:25.9659655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:25.9660123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:26.1434986Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4_9cf3la 2022-09-27T15:52:26.1436748Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4_9cf3la/_remote_module_non_scriptable.py 2022-09-27T15:52:26.1857836Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmdy_hxjk 2022-09-27T15:52:26.1860603Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmdy_hxjk/_remote_module_non_scriptable.py 2022-09-27T15:52:26.5743242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:26.6324103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:27.0433079Z skip: Need at least 4 CUDA devices (4.170s) 2022-09-27T15:52:27.0433329Z 
2022-09-27T15:52:27.0433723Z ---------------------------------------------------------------------- 2022-09-27T15:52:27.0434070Z Ran 1 test in 4.170s 2022-09-27T15:52:27.0434235Z 2022-09-27T15:52:27.0434355Z OK (skipped=1) 2022-09-27T15:52:27.0434511Z 2022-09-27T15:52:27.0434639Z Generating XML reports... 2022-09-27T15:52:27.0472569Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155222.xml 2022-09-27T15:52:29.0081732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:29.0082406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:29.0084172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:29.0084652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:29.2440202Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4rse8rst 2022-09-27T15:52:29.2441908Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4rse8rst/_remote_module_non_scriptable.py 2022-09-27T15:52:29.6878461Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:29.6895480Z 2022-09-27T15:52:29.6895918Z Running tests... 2022-09-27T15:52:29.6896384Z ---------------------------------------------------------------------- 2022-09-27T15:52:31.1783983Z test_basic_nccl_ckpt_always (__main__.TensorPipePipeWithDDPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:31.1968559Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7942 2022-09-27T15:52:31.1974897Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7943 2022-09-27T15:52:32.7198139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:32.7198833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:32.7199731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:32.7200226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:32.7480856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:32.7481321Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:32.7484370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:32.7484851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:32.9326830Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpheo2fv1o 2022-09-27T15:52:32.9327451Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpheo2fv1o/_remote_module_non_scriptable.py 2022-09-27T15:52:32.9697305Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcqqmx5e0 2022-09-27T15:52:32.9699829Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcqqmx5e0/_remote_module_non_scriptable.py 2022-09-27T15:52:33.3593032Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:33.4142603Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-09-27T15:52:33.8034086Z skip: Need at least 4 CUDA devices (4.114s) 2022-09-27T15:52:33.8034340Z 2022-09-27T15:52:33.8034726Z ---------------------------------------------------------------------- 2022-09-27T15:52:33.8035068Z Ran 1 test in 4.114s 2022-09-27T15:52:33.8035213Z 2022-09-27T15:52:33.8035323Z OK (skipped=1) 2022-09-27T15:52:33.8035477Z 2022-09-27T15:52:33.8035610Z Generating XML reports... 2022-09-27T15:52:33.8071345Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155229.xml 2022-09-27T15:52:35.7645441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:35.7646188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:35.7647875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:35.7648354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:36.0000830Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpydoonb1z 2022-09-27T15:52:36.0002177Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpydoonb1z/_remote_module_non_scriptable.py 2022-09-27T15:52:36.4399720Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:36.4415211Z 2022-09-27T15:52:36.4415464Z Running tests... 2022-09-27T15:52:36.4415902Z ---------------------------------------------------------------------- 2022-09-27T15:52:37.9199370Z test_basic_nccl_ckpt_except_last (__main__.TensorPipePipeWithDDPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:37.9383637Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8045 2022-09-27T15:52:37.9390038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8046 2022-09-27T15:52:39.5357369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:39.5358385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:39.5359578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:39.5360539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:39.5566952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:39.5567894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:39.5571821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:39.5573176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:39.7744532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8vfsn_6u 2022-09-27T15:52:39.7745726Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8vfsn_6u/_remote_module_non_scriptable.py 2022-09-27T15:52:39.7823611Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpacc41jgj 2022-09-27T15:52:39.7826171Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpacc41jgj/_remote_module_non_scriptable.py 2022-09-27T15:52:40.2085895Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:40.2180660Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:40.5449113Z skip: Need at least 4 CUDA devices (4.103s) 2022-09-27T15:52:40.5449390Z 2022-09-27T15:52:40.5449755Z ---------------------------------------------------------------------- 2022-09-27T15:52:40.5450086Z Ran 1 test in 4.103s 2022-09-27T15:52:40.5450272Z 2022-09-27T15:52:40.5450378Z OK (skipped=1) 2022-09-27T15:52:40.5450532Z 2022-09-27T15:52:40.5450659Z Generating XML reports... 2022-09-27T15:52:40.5485905Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155236.xml 2022-09-27T15:52:42.5124375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:42.5124924Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:42.5126881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:42.5127364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:42.7383165Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1qb9co6r 2022-09-27T15:52:42.7384532Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1qb9co6r/_remote_module_non_scriptable.py 2022-09-27T15:52:43.1643758Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:43.1658326Z 2022-09-27T15:52:43.1658803Z Running tests... 2022-09-27T15:52:43.1659310Z ---------------------------------------------------------------------- 2022-09-27T15:52:44.6122753Z test_basic_nccl_ckpt_never (__main__.TensorPipePipeWithDDPTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:44.6298485Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8148 2022-09-27T15:52:44.6304699Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8149 2022-09-27T15:52:46.2838733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:46.2839241Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:46.2840692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:46.2841171Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:46.2923398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:46.2923839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:46.2927300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:46.2927776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:46.5196379Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2p1_mbg_ 2022-09-27T15:52:46.5198291Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2p1_mbg_/_remote_module_non_scriptable.py 2022-09-27T15:52:46.5267960Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3rwoye0x 2022-09-27T15:52:46.5270572Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3rwoye0x/_remote_module_non_scriptable.py 2022-09-27T15:52:46.9664786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:46.9668124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:47.3364655Z skip: Need at least 4 CUDA devices (4.170s) 2022-09-27T15:52:47.3364926Z 2022-09-27T15:52:47.3365292Z ---------------------------------------------------------------------- 2022-09-27T15:52:47.3365635Z Ran 1 test in 4.171s 2022-09-27T15:52:47.3365800Z 2022-09-27T15:52:47.3365910Z OK (skipped=1) 2022-09-27T15:52:47.3367076Z 2022-09-27T15:52:47.3367236Z Generating XML reports... 2022-09-27T15:52:47.3402452Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155243.xml 2022-09-27T15:52:49.3120270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:49.3120779Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:49.3121361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:49.3121834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:49.5393112Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3u6jh3rg 2022-09-27T15:52:49.5394054Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3u6jh3rg/_remote_module_non_scriptable.py 2022-09-27T15:52:49.9686599Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:49.9701270Z 2022-09-27T15:52:49.9701529Z Running tests... 
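The recurring "skip: Need at least 4 CUDA devices" results above come from a GPU-count guard: each TensorPipePipeWithDDPTest case spawns its worker processes, checks how many CUDA devices are visible, and skips when there are fewer than four. A minimal sketch of such a guard, assuming plain unittest; the helper name require_gpus is illustrative, not the harness's own decorator:

```python
import unittest

import torch

def require_gpus(n):
    # Skip the decorated test unless at least n CUDA devices are visible.
    return unittest.skipUnless(
        torch.cuda.is_available() and torch.cuda.device_count() >= n,
        f"Need at least {n} CUDA devices",
    )

class PipeWithDDPSketch(unittest.TestCase):
    @require_gpus(4)
    def test_needs_four_gpus(self):
        self.assertGreaterEqual(torch.cuda.device_count(), 4)

if __name__ == "__main__":
    unittest.main()
```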
2022-09-27T15:52:49.9701977Z ---------------------------------------------------------------------- 2022-09-27T15:52:51.4336138Z test_basic_nccl_ckpt_never_find_unused (__main__.TensorPipePipeWithDDPTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:51.4512634Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8251 2022-09-27T15:52:51.4519465Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8252 2022-09-27T15:52:53.0234619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:53.0235155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:53.0236568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:53.0237045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:53.0461392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:53.0461877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:53.0464906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:53.0465388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:53.2621616Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9mlngseu 2022-09-27T15:52:53.2622412Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9mlngseu/_remote_module_non_scriptable.py 2022-09-27T15:52:53.2714653Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1w3q47ce 2022-09-27T15:52:53.2717408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1w3q47ce/_remote_module_non_scriptable.py 2022-09-27T15:52:53.6939579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:52:53.7137782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:52:54.0575035Z skip: Need at least 4 CUDA devices (4.087s) 2022-09-27T15:52:54.0575284Z 2022-09-27T15:52:54.0575656Z ---------------------------------------------------------------------- 2022-09-27T15:52:54.0575997Z Ran 1 test in 4.087s 2022-09-27T15:52:54.0576141Z 2022-09-27T15:52:54.0576252Z OK (skipped=1) 2022-09-27T15:52:54.0576407Z 2022-09-27T15:52:54.0576534Z Generating XML reports... 
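Each test invocation above writes a JUnit-style XML report under test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent. A short sketch for tallying those files after a run; it assumes standard JUnit attributes (tests, skipped) on the testsuite element, which may vary by reporter:

```python
import glob
import xml.etree.ElementTree as ET

pattern = "test-reports/python-unittest/**/*.xml"
for path in sorted(glob.glob(pattern, recursive=True)):
    root = ET.parse(path).getroot()
    # Reports may use a bare <testsuite> or a <testsuites> wrapper.
    suites = list(root) if root.tag == "testsuites" else [root]
    for suite in suites:
        print(path, suite.get("tests"), "tests,",
              suite.get("skipped"), "skipped")
```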
2022-09-27T15:52:54.0611875Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155249.xml 2022-09-27T15:52:55.9813864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:55.9814382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:55.9815832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:55.9816323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:56.2110206Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsgpv3fdk 2022-09-27T15:52:56.2111871Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsgpv3fdk/_remote_module_non_scriptable.py 2022-09-27T15:52:56.6357477Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:52:56.6372328Z 2022-09-27T15:52:56.6372706Z Running tests... 2022-09-27T15:52:56.6373200Z ---------------------------------------------------------------------- 2022-09-27T15:52:58.0968188Z test_async_execution_nested_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:52:58.1145665Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8354 2022-09-27T15:52:58.1151627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8355 2022-09-27T15:52:58.1159258Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8356 2022-09-27T15:52:58.1165485Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8357 2022-09-27T15:52:59.6977460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:59.6979467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:59.6980088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:59.6980567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:59.7298082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:59.7298578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:59.7301634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:59.7302117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:59.7415333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:59.7415791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:59.7419356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:59.7419823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:59.7593358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:52:59.7593858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:52:59.7597119Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:52:59.7597873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:52:59.9547781Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8jgzgicm 2022-09-27T15:52:59.9549743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8jgzgicm/_remote_module_non_scriptable.py 2022-09-27T15:52:59.9686532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf81e_7vf 2022-09-27T15:52:59.9689371Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf81e_7vf/_remote_module_non_scriptable.py 2022-09-27T15:52:59.9725759Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1gyd65u8 2022-09-27T15:52:59.9728620Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1gyd65u8/_remote_module_non_scriptable.py 2022-09-27T15:52:59.9828182Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9yy5kjp3 2022-09-27T15:52:59.9830822Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9yy5kjp3/_remote_module_non_scriptable.py 2022-09-27T15:53:00.4234390Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:53:00.4355669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:53:00.4398175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:53:00.4449637Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:53:00.5451913Z fi_getinfo: -61 2022-09-27T15:53:00.5570759Z fi_getinfo: -61 2022-09-27T15:53:00.5615306Z fi_getinfo: -61 2022-09-27T15:53:00.5664256Z fi_getinfo: -61 2022-09-27T15:53:05.5351387Z ok (8.898s) 2022-09-27T15:53:05.5352088Z 2022-09-27T15:53:05.5352524Z ---------------------------------------------------------------------- 2022-09-27T15:53:05.5352868Z Ran 1 test in 8.898s 2022-09-27T15:53:05.5353031Z 2022-09-27T15:53:05.5353123Z OK 2022-09-27T15:53:05.5353250Z 2022-09-27T15:53:05.5353382Z Generating XML reports... 2022-09-27T15:53:05.5388387Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155256.xml 2022-09-27T15:53:07.5330045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:07.5330532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:07.5331685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:07.5332165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:07.7774966Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4d4kj50e 2022-09-27T15:53:07.7776070Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4d4kj50e/_remote_module_non_scriptable.py 2022-09-27T15:53:08.2244099Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:53:08.2259122Z 2022-09-27T15:53:08.2259669Z Running tests... 
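test_async_execution_nested_with_cuda_future (which just passed above) and test_async_execution_with_cuda_future exercise RPC targets decorated with @rpc.functions.async_execution, which return a torch.futures.Future instead of a plain value so the callee thread is freed while the work completes. A minimal sketch of the pattern, assuming an already-initialized RPC group; the function name and tensor shapes are illustrative:

```python
import torch
import torch.distributed.rpc as rpc

@rpc.functions.async_execution
def slow_double(x):
    fut = torch.futures.Future()
    # A real implementation would complete the future later, e.g. from a
    # callback or another thread; setting it eagerly keeps the sketch short.
    fut.set_result(x)
    return fut.then(lambda f: f.value() * 2)

# Caller side (sketch):
# ret = rpc.rpc_sync("worker1", slow_double, args=(torch.ones(2),))
```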
2022-09-27T15:53:08.2260183Z ---------------------------------------------------------------------- 2022-09-27T15:53:09.7262235Z test_async_execution_with_cuda_future (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:53:09.7446989Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8701 2022-09-27T15:53:09.7453411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8702 2022-09-27T15:53:09.7460102Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 8703 2022-09-27T15:53:09.7466819Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 8704 2022-09-27T15:53:11.3314216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:11.3314900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:11.3315735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:11.3316194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:11.3513595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:11.3514071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:11.3517637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:11.3518103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:11.3588843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:11.3589316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:11.3593018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:11.3593507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:11.3888300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:11.3888765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:11.3892293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:11.3892773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:11.5880983Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4n6labc7 2022-09-27T15:53:11.5882278Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4n6labc7/_remote_module_non_scriptable.py 2022-09-27T15:53:11.5905011Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphgrgn0g2 2022-09-27T15:53:11.5906092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjf_847m8 2022-09-27T15:53:11.5907153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphgrgn0g2/_remote_module_non_scriptable.py 2022-09-27T15:53:11.5908108Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjf_847m8/_remote_module_non_scriptable.py 2022-09-27T15:53:11.6205773Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp01d25zu4 2022-09-27T15:53:11.6208176Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp01d25zu4/_remote_module_non_scriptable.py 2022-09-27T15:53:12.0339339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:53:12.0346927Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:53:12.0348114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:53:12.0732345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:53:12.1559158Z fi_getinfo: -61 2022-09-27T15:53:12.1575117Z fi_getinfo: -61 2022-09-27T15:53:12.1579982Z fi_getinfo: -61 2022-09-27T15:53:12.1946838Z fi_getinfo: -61 2022-09-27T15:53:19.6678403Z ok (11.442s) 2022-09-27T15:53:19.6678640Z 2022-09-27T15:53:19.6679029Z ---------------------------------------------------------------------- 2022-09-27T15:53:19.6679355Z Ran 1 test in 11.442s 2022-09-27T15:53:19.6679522Z 2022-09-27T15:53:19.6679618Z OK 2022-09-27T15:53:19.6679755Z 2022-09-27T15:53:19.6679886Z Generating XML reports... 2022-09-27T15:53:19.6715819Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155308.xml 2022-09-27T15:53:21.6352364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:21.6352872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:21.6355260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:21.6355751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:21.8639927Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr8vqtqlw 2022-09-27T15:53:21.8640888Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr8vqtqlw/_remote_module_non_scriptable.py 2022-09-27T15:53:22.2963041Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:53:22.2977773Z 2022-09-27T15:53:22.2978317Z Running tests... 2022-09-27T15:53:22.2978828Z ---------------------------------------------------------------------- 2022-09-27T15:53:23.7443914Z test_cuda_future_callback_changes_devices (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:53:23.7621243Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9048 2022-09-27T15:53:23.7627627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9049 2022-09-27T15:53:23.7634847Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9050 2022-09-27T15:53:23.7641319Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9051 2022-09-27T15:53:25.3959802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:25.3960303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:25.3961651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:25.3962111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:25.4182588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:25.4183052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:25.4186267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:25.4186748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:25.4230595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:25.4231330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:25.4235058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:25.4235532Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:25.4374435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:25.4374901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:25.4378589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:25.4379057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:25.6463094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxaeljyz3 2022-09-27T15:53:25.6464155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxaeljyz3/_remote_module_non_scriptable.py 2022-09-27T15:53:25.6490206Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8ht7tcud 2022-09-27T15:53:25.6492740Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8ht7tcud/_remote_module_non_scriptable.py 2022-09-27T15:53:25.6532596Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjzwhmlao 2022-09-27T15:53:25.6534986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjzwhmlao/_remote_module_non_scriptable.py 2022-09-27T15:53:25.6661859Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc1gmueb0 2022-09-27T15:53:25.6665006Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc1gmueb0/_remote_module_non_scriptable.py 2022-09-27T15:53:26.0934730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
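The test running here, test_cuda_future_callback_changes_devices, covers a CUDA-aware future whose .then() callback returns a tensor on a different device than the one the value arrived on. A sketch of that behavior, assuming at least two visible GPUs; shapes are illustrative:

```python
import torch
from torch.futures import Future

fut = Future(devices=["cuda:0", "cuda:1"])
moved = fut.then(lambda f: f.value().to("cuda:1") * 2)
fut.set_result(torch.ones(4, device="cuda:0"))
print(moved.wait().device)  # expected: cuda:1
```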
2022-09-27T15:53:26.0939657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:53:26.0944337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:53:26.1141277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:53:33.2828458Z ok (10.985s) 2022-09-27T15:53:33.2828788Z 2022-09-27T15:53:33.2829407Z ---------------------------------------------------------------------- 2022-09-27T15:53:33.2829759Z Ran 1 test in 10.985s 2022-09-27T15:53:33.2829927Z 2022-09-27T15:53:33.2830003Z OK 2022-09-27T15:53:33.2830139Z 2022-09-27T15:53:33.2830274Z Generating XML reports... 2022-09-27T15:53:33.2868406Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155322.xml 2022-09-27T15:53:35.2465796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:35.2466295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:35.2469179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:35.2469681Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:35.4766482Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2uezw206 2022-09-27T15:53:35.4767321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2uezw206/_remote_module_non_scriptable.py 2022-09-27T15:53:35.9046548Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:53:35.9061533Z 2022-09-27T15:53:35.9061981Z Running tests... 2022-09-27T15:53:35.9062473Z ---------------------------------------------------------------------- 2022-09-27T15:53:37.3652501Z test_cuda_future_can_extract_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:53:37.3829037Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9227 2022-09-27T15:53:37.3836703Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9228 2022-09-27T15:53:37.3843277Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9229 2022-09-27T15:53:37.3850032Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9230 2022-09-27T15:53:38.9784341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:38.9784854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:38.9786877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:38.9787362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:38.9796558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:38.9797022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:38.9797605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:38.9798316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:38.9800852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:38.9801333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:38.9801926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:38.9802373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:38.9953801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:38.9954263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:38.9958760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:38.9959373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:39.2209442Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0bzcrctz 2022-09-27T15:53:39.2210051Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0bzcrctz/_remote_module_non_scriptable.py 2022-09-27T15:53:39.2213723Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6g6j7ta6 2022-09-27T15:53:39.2216973Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6g6j7ta6/_remote_module_non_scriptable.py 2022-09-27T15:53:39.2296771Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvebhd881 2022-09-27T15:53:39.2299426Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvebhd881/_remote_module_non_scriptable.py 2022-09-27T15:53:39.2340231Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0d27l5_m 2022-09-27T15:53:39.2343208Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0d27l5_m/_remote_module_non_scriptable.py 2022-09-27T15:53:39.6703359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
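test_cuda_future_can_extract_cuda_sparse_tensor, in flight here, checks that a CUDA future can locate the storages of a sparse tensor set as its result. A sketch under the assumption of one visible GPU; the indices and values are illustrative:

```python
import torch
from torch.futures import Future

indices = torch.tensor([[0, 1], [1, 0]])
values = torch.tensor([3.0, 4.0])
sparse = torch.sparse_coo_tensor(indices, values, (2, 2)).to("cuda:0")

fut = Future(devices=["cuda:0"])
fut.set_result(sparse)
print(fut.wait().to_dense())
```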
2022-09-27T15:53:39.6708882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:53:39.6745376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:53:39.6850357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:53:45.5013447Z ok (9.595s) 2022-09-27T15:53:45.5013685Z 2022-09-27T15:53:45.5014077Z ---------------------------------------------------------------------- 2022-09-27T15:53:45.5014422Z Ran 1 test in 9.595s 2022-09-27T15:53:45.5014570Z 2022-09-27T15:53:45.5014665Z OK 2022-09-27T15:53:45.5014798Z 2022-09-27T15:53:45.5018090Z Generating XML reports... 2022-09-27T15:53:45.5048320Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155335.xml 2022-09-27T15:53:47.4906302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:47.4906862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:47.4908553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:47.4909039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:47.7298052Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw3qvxw47 2022-09-27T15:53:47.7299417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw3qvxw47/_remote_module_non_scriptable.py 2022-09-27T15:53:48.1725137Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:53:48.1741047Z 2022-09-27T15:53:48.1741521Z Running tests... 2022-09-27T15:53:48.1742039Z ---------------------------------------------------------------------- 2022-09-27T15:53:49.6582401Z test_cuda_future_can_extract_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:53:49.6766973Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9462 2022-09-27T15:53:49.6773940Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9463 2022-09-27T15:53:49.6780771Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9464 2022-09-27T15:53:49.6787845Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9465 2022-09-27T15:53:51.2550083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:51.2550594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:51.2552608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:51.2553097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:51.2618564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:51.2619026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:51.2622562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:51.2623039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:51.3150341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:51.3150973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:51.3153935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:51.3154417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:51.3791065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:51.3791570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:51.3794499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:51.3794970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:51.4888922Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_s_4cvg6 2022-09-27T15:53:51.4889907Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_s_4cvg6/_remote_module_non_scriptable.py 2022-09-27T15:53:51.4953787Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplqlk98jy 2022-09-27T15:53:51.4957018Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplqlk98jy/_remote_module_non_scriptable.py 2022-09-27T15:53:51.5355001Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkgoubxi1 2022-09-27T15:53:51.5357319Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkgoubxi1/_remote_module_non_scriptable.py 2022-09-27T15:53:51.6064646Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn1wd2my3 2022-09-27T15:53:51.6067337Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn1wd2my3/_remote_module_non_scriptable.py 2022-09-27T15:53:51.9243786Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
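All of these TensorPipeTensorPipeAgentCudaRpcTest cases depend on the TensorPipe agent being told which GPUs on the caller may map to which GPUs on the callee; without a device map, sending CUDA tensors over RPC is rejected. A sketch of that configuration, with worker names and GPU indices illustrative:

```python
import torch.distributed.rpc as rpc

opts = rpc.TensorPipeRpcBackendOptions()
opts.set_device_map("worker1", {0: 1})  # caller's cuda:0 -> callee's cuda:1
# rpc.init_rpc("worker0", rank=0, world_size=2, rpc_backend_options=opts)
```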
2022-09-27T15:53:51.9370839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:53:51.9751467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:53:52.0652991Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:53:56.6928490Z ok (8.518s) 2022-09-27T15:53:56.6928730Z 2022-09-27T15:53:56.6929602Z ---------------------------------------------------------------------- 2022-09-27T15:53:56.6929972Z Ran 1 test in 8.519s 2022-09-27T15:53:56.6930139Z 2022-09-27T15:53:56.6930235Z OK 2022-09-27T15:53:56.6930378Z 2022-09-27T15:53:56.6930514Z Generating XML reports... 2022-09-27T15:53:56.6965940Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155348.xml 2022-09-27T15:53:58.6620681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:53:58.6621446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:53:58.6623853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:53:58.6624649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:53:58.8925709Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpil9yye5k 2022-09-27T15:53:58.8926530Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpil9yye5k/_remote_module_non_scriptable.py 2022-09-27T15:53:59.3223048Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:53:59.3238591Z 2022-09-27T15:53:59.3239021Z Running tests... 2022-09-27T15:53:59.3239508Z ---------------------------------------------------------------------- 2022-09-27T15:54:00.7696685Z test_cuda_future_can_extract_custom_class_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:54:00.7875775Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9637 2022-09-27T15:54:00.7882361Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9638 2022-09-27T15:54:00.7888905Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9639 2022-09-27T15:54:00.7895787Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9640 2022-09-27T15:54:02.3849944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:02.3850460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:02.3851738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:02.3852199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:02.3853178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:02.3853669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:02.3854250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:02.3854722Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:02.3857691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:02.3858176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:02.3858748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:02.3859211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:02.3993405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:02.3994075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:02.3997711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:02.3998522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:02.6257672Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6e9la2iy 2022-09-27T15:54:02.6258587Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6e9la2iy/_remote_module_non_scriptable.py 2022-09-27T15:54:02.6265874Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8qknwcs6 2022-09-27T15:54:02.6268704Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8qknwcs6/_remote_module_non_scriptable.py 2022-09-27T15:54:02.6324423Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvfgcibqi 2022-09-27T15:54:02.6327215Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvfgcibqi/_remote_module_non_scriptable.py 2022-09-27T15:54:02.6381734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqshnygo2 2022-09-27T15:54:02.6384843Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqshnygo2/_remote_module_non_scriptable.py 2022-09-27T15:54:03.0749421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:54:03.0776777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:54:03.0811159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:54:03.0920366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:54:08.9057516Z ok (9.582s) 2022-09-27T15:54:08.9057735Z 2022-09-27T15:54:08.9058126Z ---------------------------------------------------------------------- 2022-09-27T15:54:08.9058473Z Ran 1 test in 9.582s 2022-09-27T15:54:08.9058641Z 2022-09-27T15:54:08.9058732Z OK 2022-09-27T15:54:08.9058865Z 2022-09-27T15:54:08.9061106Z Generating XML reports... 2022-09-27T15:54:08.9097509Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155359.xml 2022-09-27T15:54:10.8758530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:10.8759075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:10.8761097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:10.8761560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:11.1126674Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgehqoumu 2022-09-27T15:54:11.1129053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgehqoumu/_remote_module_non_scriptable.py 2022-09-27T15:54:11.5552000Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:54:11.5568223Z 2022-09-27T15:54:11.5568587Z Running tests... 2022-09-27T15:54:11.5569070Z ---------------------------------------------------------------------- 2022-09-27T15:54:13.0454311Z test_cuda_future_can_extract_custom_class_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:54:13.0636983Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9876 2022-09-27T15:54:13.0643881Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9877 2022-09-27T15:54:13.0650489Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 9878 2022-09-27T15:54:13.0657455Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 9879 2022-09-27T15:54:14.6543596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:14.6544136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:14.6545633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:14.6546468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:14.7029820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:14.7030285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:14.7032911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:14.7033379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:14.7045395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:14.7045839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:14.7049492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:14.7049969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:14.7407711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:14.7408151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:14.7410712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:14.7411182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:14.8976886Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb2q6_nw2 2022-09-27T15:54:14.8978127Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb2q6_nw2/_remote_module_non_scriptable.py 2022-09-27T15:54:14.9312523Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe3psvzpr 2022-09-27T15:54:14.9315217Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe3psvzpr/_remote_module_non_scriptable.py 2022-09-27T15:54:14.9337178Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppjbju2yw 2022-09-27T15:54:14.9339915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppjbju2yw/_remote_module_non_scriptable.py 2022-09-27T15:54:14.9603881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6doqx8gf 2022-09-27T15:54:14.9606595Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6doqx8gf/_remote_module_non_scriptable.py 2022-09-27T15:54:15.3528029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T15:54:15.3765450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:54:15.3787959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:54:15.4035072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:54:21.1819406Z ok (9.625s) 2022-09-27T15:54:21.1819648Z 2022-09-27T15:54:21.1820038Z ---------------------------------------------------------------------- 2022-09-27T15:54:21.1820378Z Ran 1 test in 9.625s 2022-09-27T15:54:21.1820543Z 2022-09-27T15:54:21.1820638Z OK 2022-09-27T15:54:21.1820754Z 2022-09-27T15:54:21.1820885Z Generating XML reports... 2022-09-27T15:54:21.1856315Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155411.xml 2022-09-27T15:54:23.1366916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:23.1367459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:23.1368843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:23.1369318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:23.3662881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5hq1lq2q 2022-09-27T15:54:23.3663893Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5hq1lq2q/_remote_module_non_scriptable.py 2022-09-27T15:54:23.7908357Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:54:23.7922830Z 2022-09-27T15:54:23.7923244Z Running tests... 2022-09-27T15:54:23.7923718Z ---------------------------------------------------------------------- 2022-09-27T15:54:25.2246979Z test_cuda_future_can_extract_list_with_cuda_sparse_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:54:25.2426621Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10055 2022-09-27T15:54:25.2433586Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10056 2022-09-27T15:54:25.2440543Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10057 2022-09-27T15:54:25.2446567Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10058 2022-09-27T15:54:26.8313717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:26.8314699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:26.8315856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:26.8316764Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:26.8320697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:26.8321655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:26.8327005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:26.8327998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:26.8741372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:26.8742316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:26.8744277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:26.8745234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:26.9134187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:26.9135080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:26.9136552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:26.9137401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:27.0560048Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa_2660t7 2022-09-27T15:54:27.0561153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa_2660t7/_remote_module_non_scriptable.py 2022-09-27T15:54:27.0740132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeih0k6o6 2022-09-27T15:54:27.0742363Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeih0k6o6/_remote_module_non_scriptable.py 2022-09-27T15:54:27.0939660Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5jaz9p7f 2022-09-27T15:54:27.0941577Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5jaz9p7f/_remote_module_non_scriptable.py 2022-09-27T15:54:27.1403673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl3f3kwlj 2022-09-27T15:54:27.1406221Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl3f3kwlj/_remote_module_non_scriptable.py 2022-09-27T15:54:27.5083628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
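test_cuda_future_can_extract_list_with_cuda_sparse_tensor (running here) and its dense sibling below verify that tensor extraction also works when the future's value is a container rather than a bare tensor. A sketch for the dense list case, assuming one visible GPU:

```python
import torch
from torch.futures import Future

fut = Future(devices=["cuda:0"])
fut.set_result([torch.ones(2, device="cuda:0"),
                torch.zeros(3, device="cuda:0")])
first, second = fut.wait()
print(first.device, second.device)
```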
2022-09-27T15:54:27.5235236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:54:27.5295274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:54:27.5893022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:54:33.3608796Z ok (9.568s) 2022-09-27T15:54:33.3609027Z 2022-09-27T15:54:33.3609691Z ---------------------------------------------------------------------- 2022-09-27T15:54:33.3610049Z Ran 1 test in 9.568s 2022-09-27T15:54:33.3610213Z 2022-09-27T15:54:33.3610705Z OK 2022-09-27T15:54:33.3610942Z 2022-09-27T15:54:33.3611128Z Generating XML reports... 2022-09-27T15:54:33.3645681Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155423.xml 2022-09-27T15:54:35.3295033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:35.3295517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:35.3297567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:35.3298051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:35.5597293Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprkzowjju 2022-09-27T15:54:35.5598660Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprkzowjju/_remote_module_non_scriptable.py 2022-09-27T15:54:35.9868531Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:54:35.9884138Z 2022-09-27T15:54:35.9884498Z Running tests... 2022-09-27T15:54:35.9884993Z ---------------------------------------------------------------------- 2022-09-27T15:54:37.4416124Z test_cuda_future_can_extract_list_with_cuda_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:54:37.4593395Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10290 2022-09-27T15:54:37.4601135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10291 2022-09-27T15:54:37.4607078Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10292 2022-09-27T15:54:37.4613560Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10293 2022-09-27T15:54:39.0633579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:39.0634095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:39.0635380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:39.0635862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:39.1029985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:39.1030457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:39.1034669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:39.1035149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:39.1035710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:39.1036168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:39.1039161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:39.1039904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:39.1655338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:39.1655823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:39.1658307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:39.1658783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:39.3015530Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg4j97x2h 2022-09-27T15:54:39.3016223Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg4j97x2h/_remote_module_non_scriptable.py 2022-09-27T15:54:39.3356999Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4t9f3lis 2022-09-27T15:54:39.3358434Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4t9f3lis/_remote_module_non_scriptable.py 2022-09-27T15:54:39.3405193Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppqoyza2p 2022-09-27T15:54:39.3409365Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppqoyza2p/_remote_module_non_scriptable.py 2022-09-27T15:54:39.3948630Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp34sm520w 2022-09-27T15:54:39.3950530Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp34sm520w/_remote_module_non_scriptable.py 2022-09-27T15:54:39.7337145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T15:54:39.7773045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:54:39.7864727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:54:39.8458840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:54:45.1772202Z ok (9.188s) 2022-09-27T15:54:45.1772432Z 2022-09-27T15:54:45.1772822Z ---------------------------------------------------------------------- 2022-09-27T15:54:45.1773149Z Ran 1 test in 9.189s 2022-09-27T15:54:45.1773313Z 2022-09-27T15:54:45.1773411Z OK 2022-09-27T15:54:45.1773551Z 2022-09-27T15:54:45.1773684Z Generating XML reports... 2022-09-27T15:54:45.1809501Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155435.xml 2022-09-27T15:54:47.1473126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:47.1473857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:47.1476034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:47.1476520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:47.3759094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi4ciu5w6 2022-09-27T15:54:47.3760276Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi4ciu5w6/_remote_module_non_scriptable.py 2022-09-27T15:54:47.8000139Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:54:47.8014944Z 2022-09-27T15:54:47.8015355Z Running tests... 2022-09-27T15:54:47.8015863Z ---------------------------------------------------------------------- 2022-09-27T15:54:49.2527404Z test_cuda_future_device_as_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
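[Editor's note] Each passing test above ends with a "Generated XML report: ..." record. A quick way to tally such reports offline is plain xml.etree parsing; the sketch below assumes the files follow the common JUnit-style schema (tests/failures/errors attributes), which the log itself does not confirm:

    import xml.etree.ElementTree as ET

    # Path copied from a "Generated XML report:" record above.
    path = ("test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/"
            "TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155435.xml")
    root = ET.parse(path).getroot()
    # Some emitters wrap a <testsuite> in a <testsuites> root; handle both.
    suite = root if root.tag == "testsuite" else root.find("testsuite")
    print(suite.get("tests"), "tests,",
          suite.get("failures"), "failures,",
          suite.get("errors"), "errors")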
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:54:49.2705256Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10465 2022-09-27T15:54:49.2712432Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10466 2022-09-27T15:54:49.2721185Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10467 2022-09-27T15:54:49.2729411Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10468 2022-09-27T15:54:50.8790463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:50.8791262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:50.8792256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:50.8792719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:50.9061538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:50.9062297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:50.9064983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:50.9065449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:50.9066334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:50.9066777Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:50.9071165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:50.9071825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:50.9348553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:50.9349039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:50.9352827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:50.9353320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:51.1297544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2a0rvnrb 2022-09-27T15:54:51.1298139Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2a0rvnrb/_remote_module_non_scriptable.py 2022-09-27T15:54:51.1438626Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqvr7g115 2022-09-27T15:54:51.1441044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqvr7g115/_remote_module_non_scriptable.py 2022-09-27T15:54:51.1442052Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwegp61w9 2022-09-27T15:54:51.1446954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwegp61w9/_remote_module_non_scriptable.py 2022-09-27T15:54:51.1583289Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcqncr_ko 2022-09-27T15:54:51.1585882Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcqncr_ko/_remote_module_non_scriptable.py 2022-09-27T15:54:51.5727243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:54:51.5861189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:54:51.5865379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:54:51.5966293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:54:51.9801016Z ok (4.178s) 2022-09-27T15:54:51.9801239Z 2022-09-27T15:54:51.9801629Z ---------------------------------------------------------------------- 2022-09-27T15:54:51.9801974Z Ran 1 test in 4.178s 2022-09-27T15:54:51.9802121Z 2022-09-27T15:54:51.9802237Z OK 2022-09-27T15:54:51.9805853Z 2022-09-27T15:54:51.9806250Z Generating XML reports... 2022-09-27T15:54:51.9837908Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155447.xml 2022-09-27T15:54:53.9306708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:53.9307190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:53.9309568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:53.9310056Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:54.1601604Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo_z0lhr5 2022-09-27T15:54:54.1602760Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo_z0lhr5/_remote_module_non_scriptable.py 2022-09-27T15:54:54.5929947Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:54:54.5944840Z 2022-09-27T15:54:54.5945324Z Running tests... 2022-09-27T15:54:54.5945831Z ---------------------------------------------------------------------- 2022-09-27T15:54:56.0321552Z test_cuda_future_device_as_int (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:54:56.0497322Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10636 2022-09-27T15:54:56.0503854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10637 2022-09-27T15:54:56.0510062Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10638 2022-09-27T15:54:56.0516743Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10639 2022-09-27T15:54:57.6362056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:57.6362551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:57.6363150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:57.6363603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:57.6364182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:57.6364637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:57.6366063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:57.6366534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:57.6791816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:57.6792260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:57.6795909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:57.6796386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:57.6950440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:54:57.6951095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:54:57.6954954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:54:57.6955422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:54:57.8930808Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcduxs_v2 2022-09-27T15:54:57.8931361Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1knbr6_d 2022-09-27T15:54:57.8931920Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcduxs_v2/_remote_module_non_scriptable.py 2022-09-27T15:54:57.8932859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1knbr6_d/_remote_module_non_scriptable.py 2022-09-27T15:54:57.9064964Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxyxunjwt 2022-09-27T15:54:57.9067666Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxyxunjwt/_remote_module_non_scriptable.py 2022-09-27T15:54:57.9244770Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd8lfk5vs 2022-09-27T15:54:57.9247877Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd8lfk5vs/_remote_module_non_scriptable.py 2022-09-27T15:54:58.3364242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:54:58.3381698Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:54:58.3534775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:54:58.3778403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:54:58.8588525Z ok (4.264s) 2022-09-27T15:54:58.8588753Z 2022-09-27T15:54:58.8589340Z ---------------------------------------------------------------------- 2022-09-27T15:54:58.8589904Z Ran 1 test in 4.264s 2022-09-27T15:54:58.8590086Z 2022-09-27T15:54:58.8590188Z OK 2022-09-27T15:54:58.8590327Z 2022-09-27T15:54:58.8590464Z Generating XML reports... 2022-09-27T15:54:58.8628374Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155454.xml 2022-09-27T15:55:00.7862297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:00.7862826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:00.7864475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:00.7864955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:01.0170703Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2b87nm_f 2022-09-27T15:55:01.0171560Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2b87nm_f/_remote_module_non_scriptable.py 2022-09-27T15:55:01.4511883Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:01.4526808Z 2022-09-27T15:55:01.4527082Z Running tests... 2022-09-27T15:55:01.4527520Z ---------------------------------------------------------------------- 2022-09-27T15:55:02.9088982Z test_cuda_future_device_as_str (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
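[Editor's note] The repeated UserWarning pairs above ("loaded 45 slow tests" / "loaded 261 disabled tests") are emitted once per worker process as common_utils.py loads its slow- and disabled-test dictionaries. A minimal sketch of the pattern the warning lines themselves show; the JSON file name here is hypothetical, since the log does not reveal where the dictionaries come from:

    import json
    import warnings

    with open("slow-tests.json") as f:  # hypothetical source file
        slow_tests_dict = json.load(f)
    # Matches the warning text seen once per worker process in the log.
    warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")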
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:02.9263589Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10807 2022-09-27T15:55:02.9269916Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10808 2022-09-27T15:55:02.9276579Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10809 2022-09-27T15:55:02.9282995Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10810 2022-09-27T15:55:04.5096230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:04.5096827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:04.5098199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:04.5099151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:04.5318700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:04.5319367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:04.5323754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:04.5324319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:04.5552407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:04.5553318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:04.5556774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:04.5557542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:04.5847652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:04.5848339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:04.5851718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:04.5852494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:04.7644116Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpthch2hvp 2022-09-27T15:55:04.7645231Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpthch2hvp/_remote_module_non_scriptable.py 2022-09-27T15:55:04.7716415Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn4krjjw0 2022-09-27T15:55:04.7719472Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn4krjjw0/_remote_module_non_scriptable.py 2022-09-27T15:55:04.7782745Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkepf1byo 2022-09-27T15:55:04.7785634Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkepf1byo/_remote_module_non_scriptable.py 2022-09-27T15:55:04.8103901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6ov6cere 2022-09-27T15:55:04.8106742Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6ov6cere/_remote_module_non_scriptable.py 2022-09-27T15:55:05.2064101Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T15:55:05.2270354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:55:05.2300228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:55:05.2622171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:55:05.6362903Z ok (4.183s) 2022-09-27T15:55:05.6363330Z 2022-09-27T15:55:05.6364084Z ---------------------------------------------------------------------- 2022-09-27T15:55:05.6364603Z Ran 1 test in 4.183s 2022-09-27T15:55:05.6364753Z 2022-09-27T15:55:05.6364869Z OK 2022-09-27T15:55:05.6364998Z 2022-09-27T15:55:05.6365136Z Generating XML reports... 2022-09-27T15:55:05.6401594Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155501.xml 2022-09-27T15:55:07.6340714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:07.6341231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:07.6342613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:07.6343419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:07.8646014Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp08au13e_ 2022-09-27T15:55:07.8647262Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp08au13e_/_remote_module_non_scriptable.py 2022-09-27T15:55:08.2905458Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:08.2919718Z 2022-09-27T15:55:08.2920035Z Running tests... 2022-09-27T15:55:08.2921364Z ---------------------------------------------------------------------- 2022-09-27T15:55:09.7446429Z test_cuda_future_device_not_cuda (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
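[Editor's note] test_cuda_future_device_as_device, _as_int, and _as_str (all passing above) apparently cover the three spellings a CUDA-aware future accepts for its devices argument. A minimal sketch against the public torch.futures.Future API, assuming that is the surface these tests exercise (requires a CUDA device):

    import torch

    # The devices entries may be torch.device objects, bare indices, or strings.
    for dev in (torch.device("cuda:0"), 0, "cuda:0"):
        fut = torch.futures.Future(devices=[dev])
        fut.set_result(torch.ones(2, device="cuda:0"))
        assert fut.wait().sum().item() == 2.0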
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:09.7621841Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10978 2022-09-27T15:55:09.7627882Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10979 2022-09-27T15:55:09.7634422Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 10980 2022-09-27T15:55:09.7640911Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 10981 2022-09-27T15:55:11.3714941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:11.3715846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:11.3717109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:11.3717744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:11.3737770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:11.3738242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:11.3741870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:11.3742353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:11.3783424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:11.3783899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:11.3787626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:11.3788107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:11.3910407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:11.3911113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:11.3914697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:11.3915312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:11.6105360Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp162gx_iv 2022-09-27T15:55:11.6106366Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp162gx_iv/_remote_module_non_scriptable.py 2022-09-27T15:55:11.6196484Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl5x79i0j 2022-09-27T15:55:11.6199397Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl5x79i0j/_remote_module_non_scriptable.py 2022-09-27T15:55:11.6200188Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8w0un81b 2022-09-27T15:55:11.6203048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8w0un81b/_remote_module_non_scriptable.py 2022-09-27T15:55:11.6286116Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpak587d5a 2022-09-27T15:55:11.6289036Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpak587d5a/_remote_module_non_scriptable.py 2022-09-27T15:55:12.0585111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:55:12.0687946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:55:12.0771172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:55:12.0836216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:55:12.4717106Z ok (4.179s) 2022-09-27T15:55:12.4717458Z 2022-09-27T15:55:12.4718269Z ---------------------------------------------------------------------- 2022-09-27T15:55:12.4718731Z Ran 1 test in 4.180s 2022-09-27T15:55:12.4718904Z 2022-09-27T15:55:12.4718982Z OK 2022-09-27T15:55:12.4719119Z 2022-09-27T15:55:12.4719251Z Generating XML reports... 2022-09-27T15:55:12.4756056Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155508.xml 2022-09-27T15:55:14.4524746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:14.4525255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:14.4526576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:14.4527245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:14.6914842Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp80qsvn_o 2022-09-27T15:55:14.6915834Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp80qsvn_o/_remote_module_non_scriptable.py 2022-09-27T15:55:15.1364217Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:15.1379404Z 2022-09-27T15:55:15.1379679Z Running tests... 2022-09-27T15:55:15.1380416Z ---------------------------------------------------------------------- 2022-09-27T15:55:16.6141427Z test_cuda_future_modify_tensor_inplace (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:16.6324415Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11149 2022-09-27T15:55:16.6330937Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11150 2022-09-27T15:55:16.6337489Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11151 2022-09-27T15:55:16.6343904Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11152 2022-09-27T15:55:18.2407204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:18.2407739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:18.2409072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:18.2409559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:18.2538147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:18.2538643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:18.2542094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:18.2542578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:18.2915771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:18.2916253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:18.2919169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:18.2919647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:18.3598128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:18.3598612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:18.3600528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:18.3601331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:18.4941911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbdc0rtbh 2022-09-27T15:55:18.4942496Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbdc0rtbh/_remote_module_non_scriptable.py 2022-09-27T15:55:18.4943037Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpczzoe6yh 2022-09-27T15:55:18.4946545Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpczzoe6yh/_remote_module_non_scriptable.py 2022-09-27T15:55:18.5112498Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm7xg3jc7 2022-09-27T15:55:18.5115841Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm7xg3jc7/_remote_module_non_scriptable.py 2022-09-27T15:55:18.5866015Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5fflrig1 2022-09-27T15:55:18.5868228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5fflrig1/_remote_module_non_scriptable.py 2022-09-27T15:55:18.9492690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:55:18.9524091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:55:18.9524770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:55:19.0443264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:55:20.9444733Z ok (5.806s) 2022-09-27T15:55:20.9444943Z 2022-09-27T15:55:20.9445334Z ---------------------------------------------------------------------- 2022-09-27T15:55:20.9445658Z Ran 1 test in 5.806s 2022-09-27T15:55:20.9445845Z 2022-09-27T15:55:20.9445939Z OK 2022-09-27T15:55:20.9446073Z 2022-09-27T15:55:20.9446207Z Generating XML reports... 2022-09-27T15:55:20.9481643Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155515.xml 2022-09-27T15:55:22.8944061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:22.8944587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:22.8946899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:22.8947397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:23.1257692Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplfxchq0o 2022-09-27T15:55:23.1258542Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplfxchq0o/_remote_module_non_scriptable.py 2022-09-27T15:55:23.5481831Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:23.5496430Z 2022-09-27T15:55:23.5496809Z Running tests... 2022-09-27T15:55:23.5497311Z ---------------------------------------------------------------------- 2022-09-27T15:55:25.0235141Z test_cuda_future_replace_tensor (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
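[Editor's note] test_cuda_future_modify_tensor_inplace (passing above) and test_cuda_future_replace_tensor (just starting) presumably check how a CUDA-aware future tracks the tensors handed to set_result. An illustrative sketch only, under that assumption:

    import torch

    fut = torch.futures.Future(devices=["cuda:0"])
    t = torch.zeros(2, device="cuda:0")
    fut.set_result(t)
    # Mutating `t` in place after set_result (e.g. t.fill_(1)) can race with a
    # consumer that synchronizes on the value recorded at set_result time,
    # which is presumably the hazard the in-place test guards against.
    value = fut.wait()
    print(value)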
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:25.0413500Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11324 2022-09-27T15:55:25.0419555Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11325 2022-09-27T15:55:25.0425863Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11326 2022-09-27T15:55:25.0432551Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11327 2022-09-27T15:55:26.6142712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:26.6143235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:26.6144826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:26.6145334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:26.6488812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:26.6489271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:26.6492845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:26.6493337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:26.6890706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:26.6891404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:26.6894551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:26.6895168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:26.7007669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:26.7008146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:26.7010913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:26.7011394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:26.8568572Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0iafieas 2022-09-27T15:55:26.8569168Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0iafieas/_remote_module_non_scriptable.py 2022-09-27T15:55:26.8703353Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw92wgpkm 2022-09-27T15:55:26.8706326Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw92wgpkm/_remote_module_non_scriptable.py 2022-09-27T15:55:26.9142523Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpejm4juk1 2022-09-27T15:55:26.9145179Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpejm4juk1/_remote_module_non_scriptable.py 2022-09-27T15:55:26.9246755Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp36dtz1ic 2022-09-27T15:55:26.9249761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp36dtz1ic/_remote_module_non_scriptable.py 2022-09-27T15:55:27.2957739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T15:55:27.3043735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:55:27.3637111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:55:27.3688287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:55:29.3549875Z ok (5.805s) 2022-09-27T15:55:29.3550096Z 2022-09-27T15:55:29.3550488Z ---------------------------------------------------------------------- 2022-09-27T15:55:29.3551063Z Ran 1 test in 5.805s 2022-09-27T15:55:29.3551289Z 2022-09-27T15:55:29.3551464Z OK 2022-09-27T15:55:29.3551669Z 2022-09-27T15:55:29.3551809Z Generating XML reports... 2022-09-27T15:55:29.3587507Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155523.xml 2022-09-27T15:55:31.3242107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:31.3242607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:31.3244670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:31.3245422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:31.5629503Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx0andp7f 2022-09-27T15:55:31.5631589Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx0andp7f/_remote_module_non_scriptable.py 2022-09-27T15:55:32.0020812Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:32.0037042Z 2022-09-27T15:55:32.0037508Z Running tests... 2022-09-27T15:55:32.0038020Z ---------------------------------------------------------------------- 2022-09-27T15:55:33.4814328Z test_cuda_future_value_on_bad_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:33.4990712Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11499 2022-09-27T15:55:33.4997239Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11500 2022-09-27T15:55:33.5004841Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11501 2022-09-27T15:55:33.5013314Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11502 2022-09-27T15:55:35.1359323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:35.1359885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:35.1361597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:35.1362414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:35.1842474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:35.1843256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:35.1846398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:35.1847269Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:35.1854033Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:35.1854820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:35.1858613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:35.1859422Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:35.1942555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:35.1943365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:35.1946308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:35.1947141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:35.3776772Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp27dam4zd 2022-09-27T15:55:35.3778194Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp27dam4zd/_remote_module_non_scriptable.py 2022-09-27T15:55:35.4247689Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr1wiqeeg 2022-09-27T15:55:35.4250456Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr1wiqeeg/_remote_module_non_scriptable.py 2022-09-27T15:55:35.4269131Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7l02s120 2022-09-27T15:55:35.4272524Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7l02s120/_remote_module_non_scriptable.py 2022-09-27T15:55:35.4314190Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_hlc_z7c 2022-09-27T15:55:35.4317543Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_hlc_z7c/_remote_module_non_scriptable.py 2022-09-27T15:55:35.8172546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:55:35.8761231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:55:35.8804086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:55:35.8889985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:55:42.8198486Z ok (10.816s) 2022-09-27T15:55:42.8198684Z 2022-09-27T15:55:42.8199108Z ---------------------------------------------------------------------- 2022-09-27T15:55:42.8199459Z Ran 1 test in 10.816s 2022-09-27T15:55:42.8199980Z 2022-09-27T15:55:42.8200080Z OK 2022-09-27T15:55:42.8200220Z 2022-09-27T15:55:42.8200334Z Generating XML reports... 2022-09-27T15:55:42.8244782Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155531.xml 2022-09-27T15:55:44.7886473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:44.7887001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:44.7890397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:44.7890883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:45.0163289Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpps35pijh 2022-09-27T15:55:45.0164203Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpps35pijh/_remote_module_non_scriptable.py 2022-09-27T15:55:45.4425311Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:45.4440341Z 2022-09-27T15:55:45.4440516Z Running tests... 2022-09-27T15:55:45.4440977Z ---------------------------------------------------------------------- 2022-09-27T15:55:46.8973212Z test_custom_stream (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:46.9126467Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79750 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.468s) 2022-09-27T15:55:46.9127030Z 2022-09-27T15:55:46.9127324Z ---------------------------------------------------------------------- 2022-09-27T15:55:46.9127665Z Ran 1 test in 1.469s 2022-09-27T15:55:46.9127830Z 2022-09-27T15:55:46.9127944Z OK (skipped=1) 2022-09-27T15:55:46.9128098Z 2022-09-27T15:55:46.9128225Z Generating XML reports... 
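[Editor's note] test_custom_stream above is skipped because it sits in the disabled-test set (pytorch/pytorch#79750). The skip message itself names the two conditions for running it locally; the sketch below applies both, with the test file path assumed to be the standard in-tree location:

    import os
    import subprocess

    env = dict(os.environ)
    env.pop("CI", None)  # skip message: "make sure CI is not set"
    # ...and do not pass --import-disabled-tests.
    subprocess.run(
        ["python", "test/distributed/rpc/cuda/test_tensorpipe_agent.py",
         "-k", "test_custom_stream"],
        env=env, check=True,
    )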
2022-09-27T15:55:46.9160030Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155545.xml 2022-09-27T15:55:48.8129876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:48.8130360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:48.8132315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:48.8132794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:49.0441913Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpja7eo6np 2022-09-27T15:55:49.0443092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpja7eo6np/_remote_module_non_scriptable.py 2022-09-27T15:55:49.4694705Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:55:49.4709192Z 2022-09-27T15:55:49.4709410Z Running tests... 2022-09-27T15:55:49.4710095Z ---------------------------------------------------------------------- 2022-09-27T15:55:50.9270097Z test_custom_stream_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:55:50.9446742Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11712 2022-09-27T15:55:50.9453032Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11713 2022-09-27T15:55:50.9458918Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 11714 2022-09-27T15:55:50.9465305Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 11715 2022-09-27T15:55:52.5445814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:52.5446617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:52.5447864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:52.5448361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:52.5828045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:52.5828508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:52.5832169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:52.5832655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:52.5990836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:52.5991517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:52.5995304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:52.5995790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:52.6143362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:55:52.6143826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:55:52.6147529Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:55:52.6148011Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:55:52.7830700Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1mqpg4j 2022-09-27T15:55:52.7831517Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1mqpg4j/_remote_module_non_scriptable.py 2022-09-27T15:55:52.8052594Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5oql5m_s 2022-09-27T15:55:52.8055598Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5oql5m_s/_remote_module_non_scriptable.py 2022-09-27T15:55:52.8337252Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq4i57kxs 2022-09-27T15:55:52.8340208Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq4i57kxs/_remote_module_non_scriptable.py 2022-09-27T15:55:52.8402678Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpypbosnh0 2022-09-27T15:55:52.8405420Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpypbosnh0/_remote_module_non_scriptable.py 2022-09-27T15:55:53.2201533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:55:53.2419482Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:55:53.2863145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:55:53.2888542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:55:53.3417612Z fi_getinfo: -61 2022-09-27T15:55:53.3635243Z fi_getinfo: -61 2022-09-27T15:55:53.4090958Z fi_getinfo: -61 2022-09-27T15:55:53.4104068Z fi_getinfo: -61 2022-09-27T15:56:09.0819643Z ok (19.611s) 2022-09-27T15:56:09.0819875Z 2022-09-27T15:56:09.0823426Z ---------------------------------------------------------------------- 2022-09-27T15:56:09.0823809Z Ran 1 test in 19.611s 2022-09-27T15:56:09.0823988Z 2022-09-27T15:56:09.0824093Z OK 2022-09-27T15:56:09.0824234Z 2022-09-27T15:56:09.0824371Z Generating XML reports... 2022-09-27T15:56:09.0858062Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155549.xml 2022-09-27T15:56:11.0381698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:11.0382248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:11.0384659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:11.0385150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:11.2756830Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk9vdg_vi 2022-09-27T15:56:11.2757768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk9vdg_vi/_remote_module_non_scriptable.py 2022-09-27T15:56:11.7192221Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:56:11.7207685Z 2022-09-27T15:56:11.7207983Z Running tests... 
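[Editor's note] The four "fi_getinfo: -61" records above are not from the test itself: fi_getinfo is libfabric's provider-discovery call, and a -61 return presumably corresponds to -FI_ENODATA ("no matching fabric provider found"), harmless on runners without EFA-style hardware. That reading rests on libfabric reusing POSIX errno values, which is easy to spot-check:

    import errno

    # On Linux, errno 61 is ENODATA, matching libfabric's FI_ENODATA, so
    # fi_getinfo returning -61 reads as "no fabric provider matched the hints".
    print(errno.errorcode[61])  # -> 'ENODATA'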
2022-09-27T15:56:11.7208394Z ---------------------------------------------------------------------- 2022-09-27T15:56:13.1975778Z test_custom_stream_nested (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:56:13.2159279Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12071 2022-09-27T15:56:13.2166028Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12072 2022-09-27T15:56:13.2172145Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12073 2022-09-27T15:56:13.2178815Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12074 2022-09-27T15:56:14.8650364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:14.8650894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:14.8652070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:14.8652554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:14.8757650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:14.8758131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:14.8761963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:14.8762437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:14.8773525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:14.8773977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:14.8778286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:14.8778757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:14.8815871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:14.8816582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:14.8819745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:14.8820218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:15.1130800Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp75hjyd2z 2022-09-27T15:56:15.1131572Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp75hjyd2z/_remote_module_non_scriptable.py 2022-09-27T15:56:15.1237823Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6tlpv1qs 2022-09-27T15:56:15.1238454Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2jnuw_og 2022-09-27T15:56:15.1240223Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6tlpv1qs/_remote_module_non_scriptable.py 2022-09-27T15:56:15.1242365Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2jnuw_og/_remote_module_non_scriptable.py 2022-09-27T15:56:15.1306653Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwm7xpyef 2022-09-27T15:56:15.1309424Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwm7xpyef/_remote_module_non_scriptable.py 2022-09-27T15:56:15.5552424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:56:15.5646444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:56:15.5709521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:56:15.5796270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:56:15.6768490Z fi_getinfo: -61 2022-09-27T15:56:15.6859227Z fi_getinfo: -61 2022-09-27T15:56:15.6921481Z fi_getinfo: -61 2022-09-27T15:56:15.7008326Z fi_getinfo: -61 2022-09-27T15:56:24.8428495Z ok (13.122s) 2022-09-27T15:56:24.8428725Z 2022-09-27T15:56:24.8429161Z ---------------------------------------------------------------------- 2022-09-27T15:56:24.8429505Z Ran 1 test in 13.122s 2022-09-27T15:56:24.8429672Z 2022-09-27T15:56:24.8429748Z OK 2022-09-27T15:56:24.8429886Z 2022-09-27T15:56:24.8431043Z Generating XML reports... 2022-09-27T15:56:24.8465399Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155611.xml 2022-09-27T15:56:26.8129250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:26.8129775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:26.8131132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:26.8131638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:27.0494368Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxw0lu1c1 2022-09-27T15:56:27.0495789Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxw0lu1c1/_remote_module_non_scriptable.py 2022-09-27T15:56:27.4879778Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:56:27.4895282Z 2022-09-27T15:56:27.4895738Z Running tests... 2022-09-27T15:56:27.4896361Z ---------------------------------------------------------------------- 2022-09-27T15:56:28.9614740Z test_custom_stream_nested_multi (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
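[Editor's note] The test_custom_stream* family running through this stretch of the log (including test_custom_stream_nested_multi, just starting) exercises RPC work on user-created CUDA streams rather than the default stream. As a reminder of the underlying primitive, a minimal standalone sketch using the public torch.cuda stream API:

    import torch

    side = torch.cuda.Stream()        # a non-default stream
    with torch.cuda.stream(side):     # work enqueued here runs on `side`
        x = torch.ones(4, device="cuda") * 2
    # Make the default stream wait for `side` before consuming x.
    torch.cuda.current_stream().wait_stream(side)
    print(x.sum().item())             # 8.0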
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:56:28.9798224Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12430 2022-09-27T15:56:28.9804321Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12431 2022-09-27T15:56:28.9810618Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12432 2022-09-27T15:56:28.9817583Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12433 2022-09-27T15:56:30.5701478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:30.5701973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:30.5702828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:30.5703313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:30.5738272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:30.5738720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:30.5742171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:30.5742967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:30.5902260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:30.5902704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:30.5906054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:30.5906525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:30.5937288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:30.5937738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:30.5942344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:30.5942820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:30.7949778Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt2ug2_ei 2022-09-27T15:56:30.7950357Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt2ug2_ei/_remote_module_non_scriptable.py 2022-09-27T15:56:30.8298304Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1t1q0qed 2022-09-27T15:56:30.8300756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1t1q0qed/_remote_module_non_scriptable.py 2022-09-27T15:56:30.8317567Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfhs4f3nn 2022-09-27T15:56:30.8320663Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfhs4f3nn/_remote_module_non_scriptable.py 2022-09-27T15:56:30.8335793Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvkz5tai8 2022-09-27T15:56:30.8338785Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvkz5tai8/_remote_module_non_scriptable.py 2022-09-27T15:56:31.2282007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T15:56:31.2750124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:56:31.2782347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:56:31.2836807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:56:31.3497973Z fi_getinfo: -61 2022-09-27T15:56:31.3964206Z fi_getinfo: -61 2022-09-27T15:56:31.3998235Z fi_getinfo: -61 2022-09-27T15:56:31.4055945Z fi_getinfo: -61 2022-09-27T15:56:39.1047736Z ok (11.615s) 2022-09-27T15:56:39.1047968Z 2022-09-27T15:56:39.1048363Z ---------------------------------------------------------------------- 2022-09-27T15:56:39.1048690Z Ran 1 test in 11.615s 2022-09-27T15:56:39.1048878Z 2022-09-27T15:56:39.1048976Z OK 2022-09-27T15:56:39.1049112Z 2022-09-27T15:56:39.1049247Z Generating XML reports... 2022-09-27T15:56:39.1084698Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155627.xml 2022-09-27T15:56:41.0804278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:41.0804763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:41.0806967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:41.0807433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:41.3190311Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj6p6tjzx 2022-09-27T15:56:41.3191684Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj6p6tjzx/_remote_module_non_scriptable.py 2022-09-27T15:56:41.7662595Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:56:41.7678419Z 2022-09-27T15:56:41.7678805Z Running tests... 2022-09-27T15:56:41.7679272Z ---------------------------------------------------------------------- 2022-09-27T15:56:43.2517757Z test_device_map_cpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:56:43.2702548Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12784 2022-09-27T15:56:43.2709131Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12785 2022-09-27T15:56:43.2716899Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 12786 2022-09-27T15:56:43.2723075Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 12787 2022-09-27T15:56:44.8609921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:44.8610457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:44.8611811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:44.8612292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:44.8671318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:44.8672002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:44.8676066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:44.8676555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:44.8710605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:44.8711266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:44.8715518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:44.8716003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:44.8938181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:44.8938641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:44.8942733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:44.8943220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:45.1009878Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4x6_5mv1 2022-09-27T15:56:45.1011228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4x6_5mv1/_remote_module_non_scriptable.py 2022-09-27T15:56:45.1063219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph_mybldo 2022-09-27T15:56:45.1066512Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph_mybldo/_remote_module_non_scriptable.py 2022-09-27T15:56:45.1072901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsr_ozz38 2022-09-27T15:56:45.1076194Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsr_ozz38/_remote_module_non_scriptable.py 2022-09-27T15:56:45.1269443Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph90ajkhs 2022-09-27T15:56:45.1271771Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph90ajkhs/_remote_module_non_scriptable.py 2022-09-27T15:56:45.5502403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T15:56:45.5573467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:56:45.5574257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:56:45.5815762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:56:45.6522514Z fi_getinfo: -61 2022-09-27T15:56:45.6592000Z fi_getinfo: -61 2022-09-27T15:56:45.6596184Z fi_getinfo: -61 2022-09-27T15:56:45.6834514Z fi_getinfo: -61 2022-09-27T15:56:46.4805766Z ok (4.712s) 2022-09-27T15:56:46.4806001Z 2022-09-27T15:56:46.4806377Z ---------------------------------------------------------------------- 2022-09-27T15:56:46.4806720Z Ran 1 test in 4.713s 2022-09-27T15:56:46.4806882Z 2022-09-27T15:56:46.4806975Z OK 2022-09-27T15:56:46.4807123Z 2022-09-27T15:56:46.4807255Z Generating XML reports... 2022-09-27T15:56:46.4842700Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155641.xml 2022-09-27T15:56:48.4584396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:48.4584891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:48.4587104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:48.4587596Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:48.6868444Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx2gdbbdo 2022-09-27T15:56:48.6869442Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx2gdbbdo/_remote_module_non_scriptable.py 2022-09-27T15:56:49.1529403Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:56:49.1543486Z 2022-09-27T15:56:49.1543830Z Running tests... 2022-09-27T15:56:49.1544279Z ---------------------------------------------------------------------- 2022-09-27T15:56:50.6188850Z test_device_map_cpu_to_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:56:50.6368187Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13127 2022-09-27T15:56:50.6374644Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13128 2022-09-27T15:56:50.6381164Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13129 2022-09-27T15:56:50.6387689Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13130 2022-09-27T15:56:52.2546224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:52.2546713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:52.2549222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:52.2549737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:52.2732190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:52.2732921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:52.2736377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:52.2736863Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:52.2777423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:52.2777897Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:52.2781311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:52.2781801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:52.3032576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:52.3033044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:52.3036624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:52.3037102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:52.5142217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps47q4nnc 2022-09-27T15:56:52.5142833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps47q4nnc/_remote_module_non_scriptable.py 2022-09-27T15:56:52.5164321Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1ruuwft3 2022-09-27T15:56:52.5167271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1ruuwft3/_remote_module_non_scriptable.py 2022-09-27T15:56:52.5174405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0ob_bnw5 2022-09-27T15:56:52.5176017Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0ob_bnw5/_remote_module_non_scriptable.py 2022-09-27T15:56:52.5288338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0j_yafrf 2022-09-27T15:56:52.5291045Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0j_yafrf/_remote_module_non_scriptable.py 2022-09-27T15:56:52.9603519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T15:56:52.9604178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:56:52.9623264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:56:52.9728729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:56:53.0830881Z fi_getinfo: -61 2022-09-27T15:56:53.0834484Z fi_getinfo: -61 2022-09-27T15:56:53.0838342Z fi_getinfo: -61 2022-09-27T15:56:53.0943244Z fi_getinfo: -61 2022-09-27T15:56:56.6519967Z ok (7.497s) 2022-09-27T15:56:56.6520175Z 2022-09-27T15:56:56.6520596Z ---------------------------------------------------------------------- 2022-09-27T15:56:56.6520925Z Ran 1 test in 7.498s 2022-09-27T15:56:56.6521091Z 2022-09-27T15:56:56.6521186Z OK 2022-09-27T15:56:56.6521322Z 2022-09-27T15:56:56.6521455Z Generating XML reports... 2022-09-27T15:56:56.6556670Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155649.xml 2022-09-27T15:56:58.6085443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:56:58.6085948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:56:58.6088174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:56:58.6088678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:56:58.8347537Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0kcrm63o 2022-09-27T15:56:58.8348173Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0kcrm63o/_remote_module_non_scriptable.py 2022-09-27T15:56:59.2577550Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:56:59.2592356Z 2022-09-27T15:56:59.2592806Z Running tests... 2022-09-27T15:56:59.2593302Z ---------------------------------------------------------------------- 2022-09-27T15:57:00.7141715Z test_device_map_cpu_to_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:57:00.7320180Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13478 2022-09-27T15:57:00.7326525Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13479 2022-09-27T15:57:00.7333067Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13480 2022-09-27T15:57:00.7339830Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13481 2022-09-27T15:57:02.3146600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:02.3147126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:02.3147937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:02.3148392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:02.3211244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:02.3211707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:02.3215030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:02.3215493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:02.3490564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:02.3491025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:02.3494726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:02.3495185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:02.4001694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:02.4002176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:02.4004470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:02.4004944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:02.5558851Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8fldswkw 2022-09-27T15:57:02.5559905Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8fldswkw/_remote_module_non_scriptable.py 2022-09-27T15:57:02.5581312Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph8f60z6f 2022-09-27T15:57:02.5583827Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph8f60z6f/_remote_module_non_scriptable.py 2022-09-27T15:57:02.5696935Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnszrovao 2022-09-27T15:57:02.5699788Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnszrovao/_remote_module_non_scriptable.py 2022-09-27T15:57:02.6266279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvj1atl7x 2022-09-27T15:57:02.6268510Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvj1atl7x/_remote_module_non_scriptable.py 2022-09-27T15:57:03.0045881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T15:57:03.0092448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:57:03.0098299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:57:03.0766132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:57:03.1263634Z fi_getinfo: -61 2022-09-27T15:57:03.1309150Z fi_getinfo: -61 2022-09-27T15:57:03.1315126Z fi_getinfo: -61 2022-09-27T15:57:03.1981368Z fi_getinfo: -61 2022-09-27T15:57:06.7497676Z ok (7.490s) 2022-09-27T15:57:06.7498072Z 2022-09-27T15:57:06.7498852Z ---------------------------------------------------------------------- 2022-09-27T15:57:06.7499719Z Ran 1 test in 7.490s 2022-09-27T15:57:06.7499889Z 2022-09-27T15:57:06.7499990Z OK 2022-09-27T15:57:06.7500133Z 2022-09-27T15:57:06.7500808Z Generating XML reports... 2022-09-27T15:57:06.7537426Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155659.xml 2022-09-27T15:57:08.7629704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:08.7630225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:08.7632768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:08.7633241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:08.9939199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2wluj_ao 2022-09-27T15:57:08.9940138Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2wluj_ao/_remote_module_non_scriptable.py 2022-09-27T15:57:09.4194133Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:57:09.4209672Z 2022-09-27T15:57:09.4210029Z Running tests... 2022-09-27T15:57:09.4210462Z ---------------------------------------------------------------------- 2022-09-27T15:57:10.8902133Z test_device_map_gpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:57:10.9080683Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13829 2022-09-27T15:57:10.9087683Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13830 2022-09-27T15:57:10.9093945Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 13831 2022-09-27T15:57:10.9100530Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 13832 2022-09-27T15:57:12.5014198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:12.5014741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:12.5015969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:12.5016433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:12.5029259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:12.5029710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:12.5033453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:12.5033936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:12.5153455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:12.5153922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:12.5157839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:12.5158347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:12.5447177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:12.5447663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:12.5451336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:12.5451827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:12.7366704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp74n6a6qg 2022-09-27T15:57:12.7368087Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp74n6a6qg/_remote_module_non_scriptable.py 2022-09-27T15:57:12.7461963Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp860kzwup 2022-09-27T15:57:12.7464744Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp860kzwup/_remote_module_non_scriptable.py 2022-09-27T15:57:12.7473796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpax6cgulc 2022-09-27T15:57:12.7476686Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpax6cgulc/_remote_module_non_scriptable.py 2022-09-27T15:57:12.7752524Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqt3tudyj 2022-09-27T15:57:12.7755986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqt3tudyj/_remote_module_non_scriptable.py 2022-09-27T15:57:13.1820669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T15:57:13.1840836Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:57:13.1929702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:57:13.2258142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:57:13.3040366Z fi_getinfo: -61 2022-09-27T15:57:13.3052358Z fi_getinfo: -61 2022-09-27T15:57:13.3141026Z fi_getinfo: -61 2022-09-27T15:57:13.3470494Z fi_getinfo: -61 2022-09-27T15:57:16.9236822Z ok (7.502s) 2022-09-27T15:57:16.9237088Z 2022-09-27T15:57:16.9237676Z ---------------------------------------------------------------------- 2022-09-27T15:57:16.9238025Z Ran 1 test in 7.503s 2022-09-27T15:57:16.9238197Z 2022-09-27T15:57:16.9238266Z OK 2022-09-27T15:57:16.9238403Z 2022-09-27T15:57:16.9238541Z Generating XML reports... 2022-09-27T15:57:16.9273288Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155709.xml 2022-09-27T15:57:18.8897714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:18.8898730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:18.8899935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:18.8900901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:19.1264877Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpinoi0tcl 2022-09-27T15:57:19.1265778Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpinoi0tcl/_remote_module_non_scriptable.py 2022-09-27T15:57:19.5616422Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:57:19.5631241Z 2022-09-27T15:57:19.5631577Z Running tests... 2022-09-27T15:57:19.5632024Z ---------------------------------------------------------------------- 2022-09-27T15:57:21.0187142Z test_device_map_gpu_default_to_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:57:21.0340760Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/80008 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.471s) 2022-09-27T15:57:21.0341873Z 2022-09-27T15:57:21.0342453Z ---------------------------------------------------------------------- 2022-09-27T15:57:21.0343114Z Ran 1 test in 1.471s 2022-09-27T15:57:21.0343434Z 2022-09-27T15:57:21.0343646Z OK (skipped=1) 2022-09-27T15:57:21.0343911Z 2022-09-27T15:57:21.0344172Z Generating XML reports... 
2022-09-27T15:57:21.0374721Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155719.xml 2022-09-27T15:57:22.9877233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:22.9877762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:22.9879173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:22.9879657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:23.2165106Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg47vx53u 2022-09-27T15:57:23.2166756Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg47vx53u/_remote_module_non_scriptable.py 2022-09-27T15:57:23.6443792Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:57:23.6458173Z 2022-09-27T15:57:23.6458407Z Running tests... 2022-09-27T15:57:23.6458972Z ---------------------------------------------------------------------- 2022-09-27T15:57:25.0932503Z test_device_map_gpu_mixed_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:57:25.1108500Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14210 2022-09-27T15:57:25.1114858Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14211 2022-09-27T15:57:25.1120935Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14212 2022-09-27T15:57:25.1127479Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14213 2022-09-27T15:57:26.6972307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:26.6972825Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:26.6973927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:26.6974501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:26.6977074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:26.6977815Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:26.6981044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:26.6981793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:26.7453503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:26.7453971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:26.7457625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:26.7458358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:26.7907014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:26.7907932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:26.7909082Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:26.7909560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:26.9198615Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsiau4sgw 2022-09-27T15:57:26.9199296Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsiau4sgw/_remote_module_non_scriptable.py 2022-09-27T15:57:26.9374897Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp3at5v3_ 2022-09-27T15:57:26.9377693Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp3at5v3_/_remote_module_non_scriptable.py 2022-09-27T15:57:26.9660036Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4gv7uwo6 2022-09-27T15:57:26.9662741Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4gv7uwo6/_remote_module_non_scriptable.py 2022-09-27T15:57:27.0157498Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpooj2qijv 2022-09-27T15:57:27.0159801Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpooj2qijv/_remote_module_non_scriptable.py 2022-09-27T15:57:27.3600879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:57:27.3799772Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:57:27.4054305Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:57:27.4639972Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:57:27.4817542Z fi_getinfo: -61 2022-09-27T15:57:27.5016869Z fi_getinfo: -61 2022-09-27T15:57:27.5269672Z fi_getinfo: -61 2022-09-27T15:57:27.5853423Z fi_getinfo: -61 2022-09-27T15:57:33.4311852Z ok (9.785s) 2022-09-27T15:57:33.4312080Z 2022-09-27T15:57:33.4312480Z ---------------------------------------------------------------------- 2022-09-27T15:57:33.4312816Z Ran 1 test in 9.785s 2022-09-27T15:57:33.4312980Z 2022-09-27T15:57:33.4313074Z OK 2022-09-27T15:57:33.4313208Z 2022-09-27T15:57:33.4313347Z Generating XML reports... 2022-09-27T15:57:33.4349427Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155723.xml 2022-09-27T15:57:35.4002291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:35.4002807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:35.4004575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:35.4005081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:35.6366631Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6jbhzjud 2022-09-27T15:57:35.6368046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6jbhzjud/_remote_module_non_scriptable.py 2022-09-27T15:57:36.0737292Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:57:36.0753263Z 2022-09-27T15:57:36.0753598Z Running tests... 
2022-09-27T15:57:36.0754033Z ---------------------------------------------------------------------- 2022-09-27T15:57:37.5669793Z test_device_map_gpu_mixed_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:57:37.5855064Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14561 2022-09-27T15:57:37.5861209Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14562 2022-09-27T15:57:37.5867760Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14563 2022-09-27T15:57:37.5874925Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14564 2022-09-27T15:57:39.1660401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:39.1660902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:39.1662370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:39.1662869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:39.1746758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:39.1747218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:39.1750655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:39.1751366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:39.1841962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:39.1842414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:39.1846551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:39.1847015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:39.2229179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:39.2229642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:39.2232964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:39.2233434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:39.4079173Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp3hxik_j 2022-09-27T15:57:39.4080374Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp3hxik_j/_remote_module_non_scriptable.py 2022-09-27T15:57:39.4204074Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmwo_51h9 2022-09-27T15:57:39.4206269Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmwo_51h9/_remote_module_non_scriptable.py 2022-09-27T15:57:39.4236711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6kw42gpc 2022-09-27T15:57:39.4239788Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6kw42gpc/_remote_module_non_scriptable.py 2022-09-27T15:57:39.4416115Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmogl3fg_ 2022-09-27T15:57:39.4418589Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmogl3fg_/_remote_module_non_scriptable.py 2022-09-27T15:57:39.8617763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:57:39.8744634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:57:39.8750069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:57:39.8841816Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:57:39.9844854Z fi_getinfo: -61 2022-09-27T15:57:39.9959129Z fi_getinfo: -61 2022-09-27T15:57:39.9964665Z fi_getinfo: -61 2022-09-27T15:57:40.0054824Z fi_getinfo: -61 2022-09-27T15:57:45.8057457Z ok (9.730s) 2022-09-27T15:57:45.8057699Z 2022-09-27T15:57:45.8058129Z ---------------------------------------------------------------------- 2022-09-27T15:57:45.8058521Z Ran 1 test in 9.730s 2022-09-27T15:57:45.8058671Z 2022-09-27T15:57:45.8068571Z OK 2022-09-27T15:57:45.8068848Z 2022-09-27T15:57:45.8069064Z Generating XML reports... 2022-09-27T15:57:45.8103122Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155736.xml 2022-09-27T15:57:47.7948484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:47.7949029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:47.7950551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:47.7951550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:48.0244041Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5m5a_tvn 2022-09-27T15:57:48.0245269Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5m5a_tvn/_remote_module_non_scriptable.py 2022-09-27T15:57:48.4527016Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:57:48.4542032Z 2022-09-27T15:57:48.4542495Z Running tests... 2022-09-27T15:57:48.4542966Z ---------------------------------------------------------------------- 2022-09-27T15:57:49.8897145Z test_device_map_gpu_mixed_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:57:49.9075029Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14912 2022-09-27T15:57:49.9081978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14913 2022-09-27T15:57:49.9088748Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 14914 2022-09-27T15:57:49.9095278Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 14915 2022-09-27T15:57:51.5067620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:51.5068171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:51.5069371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:51.5069853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:51.5131828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:51.5132307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:51.5135610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:51.5136090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:51.5761674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:51.5762197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:51.5764731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:51.5765214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:51.5943445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:57:51.5943962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:57:51.5946137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:57:51.5946614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:57:51.7351921Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcyr30p34 2022-09-27T15:57:51.7353010Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcyr30p34/_remote_module_non_scriptable.py 2022-09-27T15:57:51.7465779Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf_u9yqfb 2022-09-27T15:57:51.7468293Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf_u9yqfb/_remote_module_non_scriptable.py 2022-09-27T15:57:51.7993584Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkbrp_2gi 2022-09-27T15:57:51.7996775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkbrp_2gi/_remote_module_non_scriptable.py 2022-09-27T15:57:51.8216525Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjr2ypd6l 2022-09-27T15:57:51.8219304Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjr2ypd6l/_remote_module_non_scriptable.py 2022-09-27T15:57:52.1660776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T15:57:52.1836772Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T15:57:52.2378751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:57:52.2692151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:57:52.2873454Z fi_getinfo: -61 2022-09-27T15:57:52.3051038Z fi_getinfo: -61 2022-09-27T15:57:52.3600193Z fi_getinfo: -61 2022-09-27T15:57:52.3906664Z fi_getinfo: -61 2022-09-27T15:57:58.1273500Z ok (9.673s) 2022-09-27T15:57:58.1273880Z 2022-09-27T15:57:58.1274325Z ---------------------------------------------------------------------- 2022-09-27T15:57:58.1274653Z Ran 1 test in 9.673s 2022-09-27T15:57:58.1274817Z 2022-09-27T15:57:58.1274945Z OK 2022-09-27T15:57:58.1275202Z 2022-09-27T15:57:58.1275416Z Generating XML reports... 2022-09-27T15:57:58.1310289Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155748.xml 2022-09-27T15:58:00.1015768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:00.1016309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:00.1017627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:00.1018110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:00.3396944Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpegmttfrn 2022-09-27T15:58:00.3398338Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpegmttfrn/_remote_module_non_scriptable.py 2022-09-27T15:58:00.7786981Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:58:00.7802828Z 2022-09-27T15:58:00.7803228Z Running tests... 2022-09-27T15:58:00.7803711Z ---------------------------------------------------------------------- 2022-09-27T15:58:02.2687977Z test_device_map_gpu_mixed_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:58:02.2866085Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15263 2022-09-27T15:58:02.2872234Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15264 2022-09-27T15:58:02.2879004Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15265 2022-09-27T15:58:02.2885328Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15266 2022-09-27T15:58:03.8757037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:03.8757539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:03.8758871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:03.8759524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:03.9078238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:03.9078748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:03.9081524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:03.9082260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:03.9114792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:03.9115439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:03.9119124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:03.9119949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:03.9231319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:03.9232223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:03.9235890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:03.9236609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:04.1254143Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_3fm5d3n 2022-09-27T15:58:04.1255228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_3fm5d3n/_remote_module_non_scriptable.py 2022-09-27T15:58:04.1423045Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp414ze_bw 2022-09-27T15:58:04.1425866Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp414ze_bw/_remote_module_non_scriptable.py 2022-09-27T15:58:04.1476179Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpst13120w 2022-09-27T15:58:04.1479179Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpst13120w/_remote_module_non_scriptable.py 2022-09-27T15:58:04.1536259Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcq73hgnw 2022-09-27T15:58:04.1539358Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcq73hgnw/_remote_module_non_scriptable.py 2022-09-27T15:58:04.5647103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
2022-09-27T15:58:04.5982795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T15:58:04.6044098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T15:58:04.6055650Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T15:58:04.6862708Z fi_getinfo: -61 2022-09-27T15:58:04.7196021Z fi_getinfo: -61 2022-09-27T15:58:04.7257404Z fi_getinfo: -61 2022-09-27T15:58:04.7268817Z fi_getinfo: -61 2022-09-27T15:58:10.6092449Z ok (9.829s) 2022-09-27T15:58:10.6092683Z 2022-09-27T15:58:10.6093108Z ---------------------------------------------------------------------- 2022-09-27T15:58:10.6093455Z Ran 1 test in 9.829s 2022-09-27T15:58:10.6093622Z 2022-09-27T15:58:10.6093721Z OK 2022-09-27T15:58:10.6093856Z 2022-09-27T15:58:10.6093984Z Generating XML reports... 2022-09-27T15:58:10.6130900Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155800.xml 2022-09-27T15:58:12.5625926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:12.5626428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:12.5628031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:12.5628521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:12.7907823Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuq_hp65a 2022-09-27T15:58:12.7908674Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuq_hp65a/_remote_module_non_scriptable.py 2022-09-27T15:58:13.2148137Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T15:58:13.2163188Z 2022-09-27T15:58:13.2163467Z Running tests... 2022-09-27T15:58:13.2163968Z ---------------------------------------------------------------------- 2022-09-27T15:58:14.6528665Z test_device_map_gpu_mixed_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T15:58:14.6705632Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15614 2022-09-27T15:58:14.6711633Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15615 2022-09-27T15:58:14.6718164Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15616 2022-09-27T15:58:14.6724416Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15617 2022-09-27T15:58:16.2632755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:16.2633273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:16.2635175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:16.2635663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:16.2682951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:16.2683396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:16.2686940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:16.2687421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:16.3348082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:16.3349057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:16.3350273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:16.3351592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:16.3498031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T15:58:16.3498884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T15:58:16.3500042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T15:58:16.3500576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T15:58:16.4935448Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt3ip0_13 2022-09-27T15:58:16.4936304Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt3ip0_13/_remote_module_non_scriptable.py 2022-09-27T15:58:16.5085652Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmply83v9ea 2022-09-27T15:58:16.5088868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmply83v9ea/_remote_module_non_scriptable.py 2022-09-27T15:58:16.5680420Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpysgk5h1k 2022-09-27T15:58:16.5683049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpysgk5h1k/_remote_module_non_scriptable.py 2022-09-27T15:58:16.5785094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptnz28irq 2022-09-27T15:58:16.5787307Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptnz28irq/_remote_module_non_scriptable.py 2022-09-27T15:58:16.9370483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T15:58:16.9497900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:58:17.0205561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:58:17.0240631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:58:17.0584451Z fi_getinfo: -61
2022-09-27T15:58:17.0712045Z fi_getinfo: -61
2022-09-27T15:58:17.1424670Z fi_getinfo: -61
2022-09-27T15:58:17.1455739Z fi_getinfo: -61
2022-09-27T15:58:22.9936796Z ok (9.777s)
2022-09-27T15:58:22.9937000Z
2022-09-27T15:58:22.9937386Z ----------------------------------------------------------------------
2022-09-27T15:58:22.9938076Z Ran 1 test in 9.777s
2022-09-27T15:58:22.9938247Z
2022-09-27T15:58:22.9938336Z OK
2022-09-27T15:58:22.9938469Z
2022-09-27T15:58:22.9938588Z Generating XML reports...
2022-09-27T15:58:22.9974416Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155813.xml
2022-09-27T15:58:24.9799882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:24.9800390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:24.9801757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:24.9802214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:25.2129035Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx0u2kkkh
2022-09-27T15:58:25.2129948Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx0u2kkkh/_remote_module_non_scriptable.py
2022-09-27T15:58:25.6466599Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:58:25.6482033Z
2022-09-27T15:58:25.6482322Z Running tests...
2022-09-27T15:58:25.6482755Z ----------------------------------------------------------------------
2022-09-27T15:58:27.0978236Z test_device_map_gpu_mixed_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:58:27.1156668Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15973
2022-09-27T15:58:27.1163145Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15974
2022-09-27T15:58:27.1170026Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 15975
2022-09-27T15:58:27.1177349Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 15976
2022-09-27T15:58:28.7046767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:28.7047319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:28.7049386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:28.7049873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:28.7107226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:28.7107684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:28.7111217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:28.7111896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:28.7331755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:28.7332235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:28.7335846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:28.7336335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:28.7368205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:28.7368656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:28.7372562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:28.7373026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:28.9437937Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo7kt_n9t
2022-09-27T15:58:28.9439099Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo7kt_n9t/_remote_module_non_scriptable.py
2022-09-27T15:58:28.9563375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkd8hrbu3
2022-09-27T15:58:28.9565870Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkd8hrbu3/_remote_module_non_scriptable.py
2022-09-27T15:58:28.9605942Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo9i0n12s
2022-09-27T15:58:28.9608794Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo9i0n12s/_remote_module_non_scriptable.py
2022-09-27T15:58:28.9680678Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgm2gwfvd
2022-09-27T15:58:28.9683823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgm2gwfvd/_remote_module_non_scriptable.py
2022-09-27T15:58:29.3946690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:58:29.4040168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:58:29.4103867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:58:29.4167890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:58:29.5161693Z fi_getinfo: -61
2022-09-27T15:58:29.5251515Z fi_getinfo: -61
2022-09-27T15:58:29.5319661Z fi_getinfo: -61
2022-09-27T15:58:29.5381113Z fi_getinfo: -61
2022-09-27T15:58:35.4388130Z ok (9.790s)
2022-09-27T15:58:35.4388348Z
2022-09-27T15:58:35.4388745Z ----------------------------------------------------------------------
2022-09-27T15:58:35.4389090Z Ran 1 test in 9.791s
2022-09-27T15:58:35.4389238Z
2022-09-27T15:58:35.4389331Z OK
2022-09-27T15:58:35.4389466Z
2022-09-27T15:58:35.4389599Z Generating XML reports...
2022-09-27T15:58:35.4426313Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155825.xml
2022-09-27T15:58:37.4022813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:37.4023351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:37.4024176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:37.4024641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:37.6311416Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptrierqzu
2022-09-27T15:58:37.6312036Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptrierqzu/_remote_module_non_scriptable.py
2022-09-27T15:58:38.0541221Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:58:38.0555671Z
2022-09-27T15:58:38.0556169Z Running tests...
2022-09-27T15:58:38.0556832Z ----------------------------------------------------------------------
2022-09-27T15:58:39.5126203Z test_device_map_gpu_mixed_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:58:39.5302121Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16332
2022-09-27T15:58:39.5308784Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16333
2022-09-27T15:58:39.5315507Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16334
2022-09-27T15:58:39.5322139Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16335
2022-09-27T15:58:41.1103388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:41.1103876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:41.1105283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:41.1106005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:41.1199097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:41.1199688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:41.1202810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:41.1203312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:41.1213675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:41.1214150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:41.1217514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:41.1217995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:41.1380526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:41.1381006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:41.1384436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:41.1384966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:41.3622522Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz5484dfm
2022-09-27T15:58:41.3623410Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz5484dfm/_remote_module_non_scriptable.py
2022-09-27T15:58:41.3640848Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ua1e8_e
2022-09-27T15:58:41.3645132Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ua1e8_e/_remote_module_non_scriptable.py
2022-09-27T15:58:41.3646063Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzqnrlcnf
2022-09-27T15:58:41.3648759Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzqnrlcnf/_remote_module_non_scriptable.py
2022-09-27T15:58:41.3711483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdwui8sms
2022-09-27T15:58:41.3714656Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdwui8sms/_remote_module_non_scriptable.py
2022-09-27T15:58:41.8117852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:58:41.8118722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:58:41.8119205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:58:41.8236183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:58:41.9357195Z fi_getinfo: -61
2022-09-27T15:58:41.9360892Z fi_getinfo: -61
2022-09-27T15:58:41.9365410Z fi_getinfo: -61
2022-09-27T15:58:41.9449226Z fi_getinfo: -61
2022-09-27T15:58:47.7497087Z ok (9.694s)
2022-09-27T15:58:47.7497523Z
2022-09-27T15:58:47.7497980Z ----------------------------------------------------------------------
2022-09-27T15:58:47.7498331Z Ran 1 test in 9.694s
2022-09-27T15:58:47.7498496Z
2022-09-27T15:58:47.7498599Z OK
2022-09-27T15:58:47.7498715Z
2022-09-27T15:58:47.7498853Z Generating XML reports...
2022-09-27T15:58:47.7534281Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155838.xml
2022-09-27T15:58:49.6987534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:49.6988051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:49.6989548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:49.6990039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:49.9288440Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpapwv6gr1
2022-09-27T15:58:49.9289580Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpapwv6gr1/_remote_module_non_scriptable.py
2022-09-27T15:58:50.3579928Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:58:50.3595385Z
2022-09-27T15:58:50.3595538Z Running tests...
2022-09-27T15:58:50.3596409Z ----------------------------------------------------------------------
2022-09-27T15:58:51.8175380Z test_device_map_gpu_mixed_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:58:51.8350993Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16691
2022-09-27T15:58:51.8357814Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16692
2022-09-27T15:58:51.8364144Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 16693
2022-09-27T15:58:51.8370492Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 16694
2022-09-27T15:58:53.4731763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:53.4732271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:53.4733173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:53.4733650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:53.4979805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:53.4980262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:53.4983410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:53.4983892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:53.5073442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:53.5073893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:53.5077582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:53.5078050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:53.5563937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:58:53.5564413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:58:53.5566794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:58:53.5567536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:58:53.7240651Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp35xmn_dl
2022-09-27T15:58:53.7241520Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp35xmn_dl/_remote_module_non_scriptable.py
2022-09-27T15:58:53.7244650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwhbe6bu5
2022-09-27T15:58:53.7247315Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwhbe6bu5/_remote_module_non_scriptable.py
2022-09-27T15:58:53.7324851Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5n3ri1d1
2022-09-27T15:58:53.7327646Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5n3ri1d1/_remote_module_non_scriptable.py
2022-09-27T15:58:53.7770661Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpenu42vea
2022-09-27T15:58:53.7773670Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpenu42vea/_remote_module_non_scriptable.py
2022-09-27T15:58:54.1639980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:58:54.1667744Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:58:54.1753573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:58:54.2192134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:58:54.2855736Z fi_getinfo: -61
2022-09-27T15:58:54.2881721Z fi_getinfo: -61
2022-09-27T15:58:54.2966164Z fi_getinfo: -61
2022-09-27T15:58:54.3408126Z fi_getinfo: -61
2022-09-27T15:59:00.1553554Z ok (9.795s)
2022-09-27T15:59:00.1553767Z
2022-09-27T15:59:00.1554159Z ----------------------------------------------------------------------
2022-09-27T15:59:00.1554527Z Ran 1 test in 9.796s
2022-09-27T15:59:00.1554691Z
2022-09-27T15:59:00.1554784Z OK
2022-09-27T15:59:00.1554917Z
2022-09-27T15:59:00.1555060Z Generating XML reports...
2022-09-27T15:59:00.1589785Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155850.xml
2022-09-27T15:59:02.1029540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:02.1030056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:02.1032656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:02.1033144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:02.3353894Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbcozh3p7
2022-09-27T15:59:02.3355174Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbcozh3p7/_remote_module_non_scriptable.py
2022-09-27T15:59:02.7684921Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:59:02.7700494Z
2022-09-27T15:59:02.7700796Z Running tests...
2022-09-27T15:59:02.7701233Z ----------------------------------------------------------------------
2022-09-27T15:59:04.2309843Z test_device_map_gpu_mixed_self_1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:59:04.2489703Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17050
2022-09-27T15:59:04.2496083Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17051
2022-09-27T15:59:04.2503560Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17052
2022-09-27T15:59:04.2509899Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17053
2022-09-27T15:59:05.8369992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:05.8370792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:05.8371702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:05.8372182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:05.8448521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:05.8448965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:05.8452044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:05.8452523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:05.8648142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:05.8648592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:05.8652631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:05.8653106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:05.8758647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:05.8759092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:05.8762726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:05.8763200Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:06.0821455Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp14fpnmvl
2022-09-27T15:59:06.0822074Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp14fpnmvl/_remote_module_non_scriptable.py
2022-09-27T15:59:06.0877128Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjooajr4d
2022-09-27T15:59:06.0879896Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjooajr4d/_remote_module_non_scriptable.py
2022-09-27T15:59:06.0927982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvdj22_1f
2022-09-27T15:59:06.0931045Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvdj22_1f/_remote_module_non_scriptable.py
2022-09-27T15:59:06.0997984Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8u32cg4c
2022-09-27T15:59:06.1000585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8u32cg4c/_remote_module_non_scriptable.py
2022-09-27T15:59:06.5326977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:59:06.5356291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:59:06.5361269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:59:06.5502248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:59:06.6542869Z fi_getinfo: -61
2022-09-27T15:59:06.6569230Z fi_getinfo: -61
2022-09-27T15:59:06.6575924Z fi_getinfo: -61
2022-09-27T15:59:06.6715020Z fi_getinfo: -61
2022-09-27T15:59:12.4713672Z ok (9.701s)
2022-09-27T15:59:12.4713889Z
2022-09-27T15:59:12.4714627Z ----------------------------------------------------------------------
2022-09-27T15:59:12.4715000Z Ran 1 test in 9.701s
2022-09-27T15:59:12.4715167Z
2022-09-27T15:59:12.4715262Z OK
2022-09-27T15:59:12.4716033Z
2022-09-27T15:59:12.4716612Z Generating XML reports...
2022-09-27T15:59:12.4749925Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155902.xml
2022-09-27T15:59:14.4146890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:14.4147451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:14.4148499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:14.4149027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:14.6490874Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp93897x3u
2022-09-27T15:59:14.6493010Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp93897x3u/_remote_module_non_scriptable.py
2022-09-27T15:59:15.0906442Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:59:15.0922176Z
2022-09-27T15:59:15.0922866Z Running tests...
2022-09-27T15:59:15.0923377Z ----------------------------------------------------------------------
2022-09-27T15:59:16.5617987Z test_device_map_gpu_mixed_self_2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:59:16.5795163Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17401
2022-09-27T15:59:16.5801550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17402
2022-09-27T15:59:16.5808124Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17403
2022-09-27T15:59:16.5814490Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17404
2022-09-27T15:59:18.2099387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:18.2099904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:18.2100856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:18.2101367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:18.2271381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:18.2272354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:18.2274874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:18.2275847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:18.2437448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:18.2438417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:18.2441489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:18.2442392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:18.2454422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:18.2455377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:18.2459071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:18.2459877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:18.4663545Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpib3dcmmk
2022-09-27T15:59:18.4664531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpib3dcmmk/_remote_module_non_scriptable.py
2022-09-27T15:59:18.4727279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpia41wnob
2022-09-27T15:59:18.4730993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpia41wnob/_remote_module_non_scriptable.py
2022-09-27T15:59:18.4817492Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmdg4y2fb
2022-09-27T15:59:18.4820000Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmdg4y2fb/_remote_module_non_scriptable.py
2022-09-27T15:59:18.4847149Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv8pzlh_b
2022-09-27T15:59:18.4850034Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv8pzlh_b/_remote_module_non_scriptable.py
2022-09-27T15:59:18.9128298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:59:18.9173989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:59:18.9340789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:59:18.9403469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:59:19.0345532Z fi_getinfo: -61
2022-09-27T15:59:19.0386319Z fi_getinfo: -61
2022-09-27T15:59:19.0560558Z fi_getinfo: -61
2022-09-27T15:59:19.0618137Z fi_getinfo: -61
2022-09-27T15:59:24.7989030Z ok (9.706s)
2022-09-27T15:59:24.7989428Z
2022-09-27T15:59:24.7990195Z ----------------------------------------------------------------------
2022-09-27T15:59:24.7990679Z Ran 1 test in 9.707s
2022-09-27T15:59:24.7991108Z
2022-09-27T15:59:24.7991195Z OK
2022-09-27T15:59:24.7991442Z
2022-09-27T15:59:24.7991909Z Generating XML reports...
2022-09-27T15:59:24.8027375Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155915.xml
2022-09-27T15:59:26.7517229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:26.7517753Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:26.7519385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:26.7519869Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:26.9791548Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa5wwx47q
2022-09-27T15:59:26.9792887Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa5wwx47q/_remote_module_non_scriptable.py
2022-09-27T15:59:27.4068836Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:59:27.4084314Z
2022-09-27T15:59:27.4084568Z Running tests...
2022-09-27T15:59:27.4084982Z ----------------------------------------------------------------------
2022-09-27T15:59:28.8531581Z test_device_map_gpu_mixed_self_3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:59:28.8708429Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17752
2022-09-27T15:59:28.8715005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17753
2022-09-27T15:59:28.8721618Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 17754
2022-09-27T15:59:28.8728281Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 17755
2022-09-27T15:59:30.4493037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:30.4493549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:30.4494938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:30.4495408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:30.4578661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:30.4579137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:30.4582814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:30.4583300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:30.4610520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:30.4610971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:30.4614392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:30.4614850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:30.4907962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:30.4908427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:30.4911724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:30.4912188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:30.6953085Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn2rvlu2_
2022-09-27T15:59:30.6953675Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpebzfsa1c
2022-09-27T15:59:30.6954536Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn2rvlu2_/_remote_module_non_scriptable.py
2022-09-27T15:59:30.6955747Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpebzfsa1c/_remote_module_non_scriptable.py
2022-09-27T15:59:30.6980438Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmx6zvcnp
2022-09-27T15:59:30.6983401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmx6zvcnp/_remote_module_non_scriptable.py
2022-09-27T15:59:30.7229289Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfgnpss_r
2022-09-27T15:59:30.7232255Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfgnpss_r/_remote_module_non_scriptable.py
2022-09-27T15:59:31.1369633Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:59:31.1436764Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:59:31.1451775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:59:31.1743551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:59:31.2585821Z fi_getinfo: -61
2022-09-27T15:59:31.2650139Z fi_getinfo: -61
2022-09-27T15:59:31.2663749Z fi_getinfo: -61
2022-09-27T15:59:31.2959967Z fi_getinfo: -61
2022-09-27T15:59:37.0935964Z ok (9.685s)
2022-09-27T15:59:37.0936185Z
2022-09-27T15:59:37.0937873Z ----------------------------------------------------------------------
2022-09-27T15:59:37.0938263Z Ran 1 test in 9.685s
2022-09-27T15:59:37.0938443Z
2022-09-27T15:59:37.0938538Z OK
2022-09-27T15:59:37.0938671Z
2022-09-27T15:59:37.0938811Z Generating XML reports...
2022-09-27T15:59:37.0975157Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155927.xml
2022-09-27T15:59:39.0774644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:39.0775152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:39.0777704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:39.0778179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:39.3068673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9zklgz6c
2022-09-27T15:59:39.3069961Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9zklgz6c/_remote_module_non_scriptable.py
2022-09-27T15:59:39.7292544Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:59:39.7306929Z
2022-09-27T15:59:39.7307375Z Running tests...
2022-09-27T15:59:39.7307882Z ----------------------------------------------------------------------
2022-09-27T15:59:41.1901155Z test_device_map_gpu_mixed_self_4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:59:41.2094378Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18103
2022-09-27T15:59:41.2101415Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18104
2022-09-27T15:59:41.2107814Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18105
2022-09-27T15:59:41.2115144Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18106
2022-09-27T15:59:42.8175202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:42.8176136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:42.8177331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:42.8177802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:42.8237954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:42.8238413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:42.8241563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:42.8242040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:42.8254219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:42.8254691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:42.8258182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:42.8258652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:42.8915273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:42.8915791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:42.8918849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:42.8919327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:43.0576321Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbrw58ow4
2022-09-27T15:59:43.0577147Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbrw58ow4/_remote_module_non_scriptable.py
2022-09-27T15:59:43.0583042Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptmhun2fq
2022-09-27T15:59:43.0585767Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptmhun2fq/_remote_module_non_scriptable.py
2022-09-27T15:59:43.0631861Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpia6sy6xk
2022-09-27T15:59:43.0635490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpia6sy6xk/_remote_module_non_scriptable.py
2022-09-27T15:59:43.1197269Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp75e1zfha
2022-09-27T15:59:43.1199758Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp75e1zfha/_remote_module_non_scriptable.py
2022-09-27T15:59:43.4993450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:59:43.5061978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:59:43.5080900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:59:43.5719917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:59:43.6210361Z fi_getinfo: -61
2022-09-27T15:59:43.6273958Z fi_getinfo: -61
2022-09-27T15:59:43.6292141Z fi_getinfo: -61
2022-09-27T15:59:43.6934699Z fi_getinfo: -61
2022-09-27T15:59:49.6296380Z ok (9.899s)
2022-09-27T15:59:49.6296747Z
2022-09-27T15:59:49.6297351Z ----------------------------------------------------------------------
2022-09-27T15:59:49.6297948Z Ran 1 test in 9.899s
2022-09-27T15:59:49.6298230Z
2022-09-27T15:59:49.6298391Z OK
2022-09-27T15:59:49.6298638Z
2022-09-27T15:59:49.6298868Z Generating XML reports...
2022-09-27T15:59:49.6338420Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155939.xml
2022-09-27T15:59:51.6149881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:51.6150392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:51.6153928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:51.6154400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:51.8441064Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4w5_pzw_
2022-09-27T15:59:51.8442627Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4w5_pzw_/_remote_module_non_scriptable.py
2022-09-27T15:59:52.2724179Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T15:59:52.2739622Z
2022-09-27T15:59:52.2739769Z Running tests...
2022-09-27T15:59:52.2740448Z ----------------------------------------------------------------------
2022-09-27T15:59:53.7100387Z test_device_map_gpu_mixed_self_5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T15:59:53.7279499Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18454
2022-09-27T15:59:53.7286201Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18455
2022-09-27T15:59:53.7292774Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18456
2022-09-27T15:59:53.7299251Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18457
2022-09-27T15:59:55.3152602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:55.3153115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:55.3155733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:55.3156229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:55.3620232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:55.3620713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:55.3624143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:55.3624621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:55.3964294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:55.3964754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:55.3968187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:55.3968672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:55.4377722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T15:59:55.4378233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T15:59:55.4379482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T15:59:55.4379949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T15:59:55.5521330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc857ox32
2022-09-27T15:59:55.5523268Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc857ox32/_remote_module_non_scriptable.py
2022-09-27T15:59:55.5851934Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppynnahv0
2022-09-27T15:59:55.5854828Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppynnahv0/_remote_module_non_scriptable.py
2022-09-27T15:59:55.6163031Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe_kx2b_u
2022-09-27T15:59:55.6165528Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe_kx2b_u/_remote_module_non_scriptable.py
2022-09-27T15:59:55.6632224Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaixf_6q5
2022-09-27T15:59:55.6634823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaixf_6q5/_remote_module_non_scriptable.py
2022-09-27T15:59:55.9854949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T15:59:56.0184342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T15:59:56.0523114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T15:59:56.1064739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T15:59:56.1069780Z fi_getinfo: -61
2022-09-27T15:59:56.1400931Z fi_getinfo: -61
2022-09-27T15:59:56.1737272Z fi_getinfo: -61
2022-09-27T15:59:56.2280786Z fi_getinfo: -61
2022-09-27T16:00:02.0476484Z ok (9.773s)
2022-09-27T16:00:02.0476713Z
2022-09-27T16:00:02.0477144Z ----------------------------------------------------------------------
2022-09-27T16:00:02.0477488Z Ran 1 test in 9.774s
2022-09-27T16:00:02.0477654Z
2022-09-27T16:00:02.0477814Z OK
2022-09-27T16:00:02.0478032Z
2022-09-27T16:00:02.0478259Z Generating XML reports...
2022-09-27T16:00:02.0514355Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155952.xml
2022-09-27T16:00:04.0524324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:04.0524865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:04.0526470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:04.0526956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:04.2880592Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzib7lzob
2022-09-27T16:00:04.2881993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzib7lzob/_remote_module_non_scriptable.py
2022-09-27T16:00:04.7255400Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:00:04.7271092Z
2022-09-27T16:00:04.7271343Z Running tests...
2022-09-27T16:00:04.7271763Z ----------------------------------------------------------------------
2022-09-27T16:00:06.2207942Z test_device_map_gpu_mixed_self_6 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:00:06.2385309Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18805
2022-09-27T16:00:06.2392119Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18806
2022-09-27T16:00:06.2399485Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 18807
2022-09-27T16:00:06.2406122Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 18808
2022-09-27T16:00:07.8421204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:07.8421711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:07.8422519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:07.8423004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:07.8482868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:07.8483626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:07.8486524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:07.8487015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:07.8588098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:07.8588556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:07.8592164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:07.8592643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:07.8680761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:07.8681222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:07.8684890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:07.8685352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:08.0833418Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplvwsbvqe
2022-09-27T16:00:08.0836815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplvwsbvqe/_remote_module_non_scriptable.py
2022-09-27T16:00:08.0890340Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf0yumhzt
2022-09-27T16:00:08.0893015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf0yumhzt/_remote_module_non_scriptable.py
2022-09-27T16:00:08.0968334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplr_ts_hg
2022-09-27T16:00:08.0971131Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplr_ts_hg/_remote_module_non_scriptable.py
2022-09-27T16:00:08.1049541Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpicxu97g6
2022-09-27T16:00:08.1052656Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpicxu97g6/_remote_module_non_scriptable.py
2022-09-27T16:00:08.5339048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:00:08.5353008Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:00:08.5413922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:00:08.5587037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:00:08.6556935Z fi_getinfo: -61
2022-09-27T16:00:08.6567096Z fi_getinfo: -61
2022-09-27T16:00:08.6624432Z fi_getinfo: -61
2022-09-27T16:00:08.6798773Z fi_getinfo: -61
2022-09-27T16:00:14.5586908Z ok (9.831s)
2022-09-27T16:00:14.5587345Z
2022-09-27T16:00:14.5588071Z ----------------------------------------------------------------------
2022-09-27T16:00:14.5588415Z Ran 1 test in 9.831s
2022-09-27T16:00:14.5588580Z
2022-09-27T16:00:14.5588673Z OK
2022-09-27T16:00:14.5589092Z
2022-09-27T16:00:14.5589247Z Generating XML reports...
2022-09-27T16:00:14.5624179Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160004.xml
2022-09-27T16:00:16.5471920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:16.5472416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:16.5475335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:16.5475824Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:16.7838166Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqopudtpy
2022-09-27T16:00:16.7839637Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqopudtpy/_remote_module_non_scriptable.py
2022-09-27T16:00:17.2246184Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:00:17.2261497Z
2022-09-27T16:00:17.2261874Z Running tests...
2022-09-27T16:00:17.2262355Z ----------------------------------------------------------------------
2022-09-27T16:00:18.7225539Z test_device_map_gpu_mixed_self_7 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:00:18.7411311Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19156
2022-09-27T16:00:18.7417689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19157
2022-09-27T16:00:18.7423948Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19158
2022-09-27T16:00:18.7430432Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19159
2022-09-27T16:00:20.3305793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:20.3306804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:20.3307974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:20.3308903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:20.3336802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:20.3337290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:20.3340343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:20.3340811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:20.3617041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:20.3617970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:20.3620177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:20.3621122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:20.4165756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:20.4166719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:20.4167953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:20.4168762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:20.5622763Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwi0cddhs
2022-09-27T16:00:20.5624125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwi0cddhs/_remote_module_non_scriptable.py
2022-09-27T16:00:20.5779064Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz1feo9wh
2022-09-27T16:00:20.5781115Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz1feo9wh/_remote_module_non_scriptable.py
2022-09-27T16:00:20.5821187Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5spji3dp
2022-09-27T16:00:20.5823187Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5spji3dp/_remote_module_non_scriptable.py
2022-09-27T16:00:20.6437552Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9_hkg_m6
2022-09-27T16:00:20.6439227Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9_hkg_m6/_remote_module_non_scriptable.py
2022-09-27T16:00:21.0077917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:00:21.0178541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:00:21.0222121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:00:21.0931099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:00:21.1291485Z fi_getinfo: -61 2022-09-27T16:00:21.1391027Z fi_getinfo: -61 2022-09-27T16:00:21.1433641Z fi_getinfo: -61 2022-09-27T16:00:21.2147205Z fi_getinfo: -61 2022-09-27T16:00:26.9612606Z ok (9.735s) 2022-09-27T16:00:26.9612823Z 2022-09-27T16:00:26.9613229Z ---------------------------------------------------------------------- 2022-09-27T16:00:26.9613570Z Ran 1 test in 9.735s 2022-09-27T16:00:26.9613716Z 2022-09-27T16:00:26.9613805Z OK 2022-09-27T16:00:26.9613940Z 2022-09-27T16:00:26.9614074Z Generating XML reports... 2022-09-27T16:00:26.9649316Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160017.xml 2022-09-27T16:00:28.8973693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:00:28.8974516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:00:28.8975354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:00:28.8975841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:00:29.1253428Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplu7bpsdy 2022-09-27T16:00:29.1254477Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplu7bpsdy/_remote_module_non_scriptable.py 2022-09-27T16:00:29.5554707Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:00:29.5569530Z 2022-09-27T16:00:29.5570050Z Running tests... 2022-09-27T16:00:29.5570558Z ---------------------------------------------------------------------- 2022-09-27T16:00:30.9923011Z test_device_map_gpu_mixed_self_8 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
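Before the next run's output, a word on what this family of tests covers: the test_device_map_* and test_device_maps_* cases exercise the TensorPipe RPC backend's device maps, which tell the agent where CUDA tensors sent from the caller should land on the callee. A minimal sketch of that API; the worker names, world size, and the cuda:0 -> cuda:1 mapping are illustrative assumptions, not taken from the test:

    import torch
    import torch.distributed.rpc as rpc

    # Tensors sent from this worker's cuda:0 will land on cuda:1 of "worker1".
    options = rpc.TensorPipeRpcBackendOptions(num_worker_threads=8)
    options.set_device_map("worker1", {0: 1})

    # init_rpc uses the env:// rendezvous by default, so MASTER_ADDR and
    # MASTER_PORT must be set in the environment before this call.
    rpc.init_rpc("worker0", rank=0, world_size=2, rpc_backend_options=options)
    out = rpc.rpc_sync("worker1", torch.add, args=(torch.ones(2, device="cuda:0"), 1))
    rpc.shutdown()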
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:00:31.0097984Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19507
2022-09-27T16:00:31.0104397Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19508
2022-09-27T16:00:31.0110492Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19509
2022-09-27T16:00:31.0117999Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19510
2022-09-27T16:00:32.5952627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:32.5953605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:32.5954801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:32.5956094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:32.6212315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:32.6213240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:32.6216454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:32.6217428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:32.6588084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:32.6589025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:32.6592084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:32.6593370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:32.7012037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:32.7012896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:32.7014265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:32.7015072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:32.8355514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2v9cx_1k
2022-09-27T16:00:32.8356884Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2v9cx_1k/_remote_module_non_scriptable.py
2022-09-27T16:00:32.8437885Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4_u3ta3s
2022-09-27T16:00:32.8440813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4_u3ta3s/_remote_module_non_scriptable.py
2022-09-27T16:00:32.8837196Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyyd_ozlq
2022-09-27T16:00:32.8840183Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyyd_ozlq/_remote_module_non_scriptable.py
2022-09-27T16:00:32.9280784Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk_xkde8v
2022-09-27T16:00:32.9283468Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk_xkde8v/_remote_module_non_scriptable.py
2022-09-27T16:00:33.2709735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:00:33.2770795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:00:33.3240004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:00:33.3697448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:00:33.3925305Z fi_getinfo: -61
2022-09-27T16:00:33.3982732Z fi_getinfo: -61
2022-09-27T16:00:33.4457146Z fi_getinfo: -61
2022-09-27T16:00:33.4911530Z fi_getinfo: -61
2022-09-27T16:00:39.3298905Z ok (9.773s)
2022-09-27T16:00:39.3299128Z
2022-09-27T16:00:39.3299563Z ----------------------------------------------------------------------
2022-09-27T16:00:39.3299894Z Ran 1 test in 9.773s
2022-09-27T16:00:39.3300066Z
2022-09-27T16:00:39.3300165Z OK
2022-09-27T16:00:39.3300308Z
2022-09-27T16:00:39.3300447Z Generating XML reports...
2022-09-27T16:00:39.3339144Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160029.xml
2022-09-27T16:00:41.3094126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:41.3094865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:41.3095899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:41.3096853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:41.5497731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptqaqfgd4
2022-09-27T16:00:41.5498811Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptqaqfgd4/_remote_module_non_scriptable.py
2022-09-27T16:00:41.9925619Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:00:41.9941321Z
2022-09-27T16:00:41.9941584Z Running tests...
2022-09-27T16:00:41.9942022Z ----------------------------------------------------------------------
2022-09-27T16:00:43.4863782Z test_device_map_gpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
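The UserWarning pairs that every spawned process prints ("loaded 45 slow tests" / "loaded 261 disabled tests") come from torch.testing._internal.common_utils, which loads dictionaries of known slow and disabled tests at import time and reports the counts via warnings.warn, exactly as quoted in the log. A rough sketch of that load-and-report pattern; the JSON file names here are hypothetical placeholders:

    import json
    import warnings

    def load_test_dict(path: str, label: str) -> dict:
        # Read a {test_name: metadata} mapping and report the entry count,
        # matching the "loaded N ..." warnings seen above.
        with open(path) as f:
            d = json.load(f)
        warnings.warn(f"loaded {len(d)} {label}")
        return d

    slow_tests_dict = load_test_dict("slow-tests.json", "slow tests")
    disabled_tests_dict = load_test_dict("disabled-tests.json", "disabled tests")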
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:00:43.5048144Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19858
2022-09-27T16:00:43.5054758Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19859
2022-09-27T16:00:43.5060897Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 19860
2022-09-27T16:00:43.5067501Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 19861
2022-09-27T16:00:45.1139937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:45.1140730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:45.1141860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:45.1142333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:45.1147574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:45.1148070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:45.1152206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:45.1152689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:45.1369568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:45.1370034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:45.1373738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:45.1374215Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:45.1522592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:45.1523056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:45.1526880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:45.1527361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:45.3445057Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp64yuenw1
2022-09-27T16:00:45.3446162Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp64yuenw1/_remote_module_non_scriptable.py
2022-09-27T16:00:45.3715390Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq9b5lbgv
2022-09-27T16:00:45.3718081Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq9b5lbgv/_remote_module_non_scriptable.py
2022-09-27T16:00:45.3731803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx_4pmyz2
2022-09-27T16:00:45.3734568Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx_4pmyz2/_remote_module_non_scriptable.py
2022-09-27T16:00:45.3739951Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqub0j8do
2022-09-27T16:00:45.3743155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqub0j8do/_remote_module_non_scriptable.py
2022-09-27T16:00:45.7914897Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:00:45.8157667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:00:45.8226664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:00:45.8254946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:00:45.9131766Z fi_getinfo: -61
2022-09-27T16:00:45.9370014Z fi_getinfo: -61
2022-09-27T16:00:45.9440224Z fi_getinfo: -61
2022-09-27T16:00:45.9470341Z fi_getinfo: -61
2022-09-27T16:00:49.4208505Z ok (7.426s)
2022-09-27T16:00:49.4209234Z
2022-09-27T16:00:49.4209684Z ----------------------------------------------------------------------
2022-09-27T16:00:49.4210034Z Ran 1 test in 7.427s
2022-09-27T16:00:49.4210182Z
2022-09-27T16:00:49.4210287Z OK
2022-09-27T16:00:49.4210423Z
2022-09-27T16:00:49.4210558Z Generating XML reports...
2022-09-27T16:00:49.4247102Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160041.xml
2022-09-27T16:00:51.4213079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:51.4213618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:51.4214898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:51.4215388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:51.6642361Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2opwxxrf
2022-09-27T16:00:51.6643832Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2opwxxrf/_remote_module_non_scriptable.py
2022-09-27T16:00:52.1063277Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:00:52.1079002Z
2022-09-27T16:00:52.1079427Z Running tests...
2022-09-27T16:00:52.1079935Z ----------------------------------------------------------------------
2022-09-27T16:00:53.5956561Z test_device_map_gpu_non_default_to_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
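The torch.distributed.nn.jit.instantiator lines ("Created a temporary directory ..." / "Writing ..._remote_module_non_scriptable.py") record a generate-then-import pattern: a Python module is rendered into a fresh temporary directory, which is then made importable. A self-contained sketch of the general technique; the module name and contents are illustrative, not the instantiator's actual template:

    import importlib
    import os
    import sys
    import tempfile

    TEMPLATE = "def run():\n    return 'generated'\n"

    tmp_dir = tempfile.mkdtemp()          # "Created a temporary directory at /tmp/..."
    with open(os.path.join(tmp_dir, "_generated_module.py"), "w") as f:
        f.write(TEMPLATE)                 # "Writing /tmp/.../<module>.py"

    sys.path.insert(0, tmp_dir)           # make the fresh directory importable
    mod = importlib.import_module("_generated_module")
    print(mod.run())                      # -> 'generated'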
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:00:53.6140087Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20205
2022-09-27T16:00:53.6146138Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20206
2022-09-27T16:00:53.6152564Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20207
2022-09-27T16:00:53.6160554Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20208
2022-09-27T16:00:55.2124712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:55.2125223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:55.2126259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:55.2126718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:55.2222926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:55.2223387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:55.2226781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:55.2227255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:55.2756991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:55.2757758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:55.2760492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:55.2760963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:55.2939766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:00:55.2940491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:00:55.2942953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:00:55.2943436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:00:55.4497997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3gjeht6t
2022-09-27T16:00:55.4498860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3gjeht6t/_remote_module_non_scriptable.py
2022-09-27T16:00:55.4534365Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl8vlrwyv
2022-09-27T16:00:55.4537127Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl8vlrwyv/_remote_module_non_scriptable.py
2022-09-27T16:00:55.5052901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdc7ylchj
2022-09-27T16:00:55.5055343Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdc7ylchj/_remote_module_non_scriptable.py
2022-09-27T16:00:55.5224489Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdev5goy5
2022-09-27T16:00:55.5227195Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdev5goy5/_remote_module_non_scriptable.py
2022-09-27T16:00:55.8928159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:00:55.8929113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:00:55.9447249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:00:55.9770860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:00:56.0150665Z fi_getinfo: -61
2022-09-27T16:00:56.0156593Z fi_getinfo: -61
2022-09-27T16:00:56.0664340Z fi_getinfo: -61
2022-09-27T16:00:56.0982519Z fi_getinfo: -61
2022-09-27T16:01:01.9343824Z ok (9.826s)
2022-09-27T16:01:01.9344087Z
2022-09-27T16:01:01.9344489Z ----------------------------------------------------------------------
2022-09-27T16:01:01.9344830Z Ran 1 test in 9.826s
2022-09-27T16:01:01.9344997Z
2022-09-27T16:01:01.9345101Z OK
2022-09-27T16:01:01.9345217Z
2022-09-27T16:01:01.9345354Z Generating XML reports...
2022-09-27T16:01:01.9380729Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160052.xml
2022-09-27T16:01:03.9380312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:03.9381979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:03.9382590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:03.9383069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:04.1769570Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgqcx2wv5
2022-09-27T16:01:04.1770986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgqcx2wv5/_remote_module_non_scriptable.py
2022-09-27T16:01:04.6257936Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:01:04.6273752Z
2022-09-27T16:01:04.6274218Z Running tests...
2022-09-27T16:01:04.6274704Z ----------------------------------------------------------------------
2022-09-27T16:01:06.1140271Z test_device_map_gpu_to_cpu_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
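Each "Started process N with pid ..." quartet reflects the multi-process fixture these distributed tests run under: the parent spawns one subprocess per rank, runs the test body in each, and joins them. A minimal stand-in for that fixture, not MultiProcessTestCase itself:

    import multiprocessing as mp
    import os

    WORLD_SIZE = 4  # one process per rank, as in the log

    def worker(rank: int) -> None:
        print(f"rank {rank} running in pid {os.getpid()}")

    if __name__ == "__main__":
        ctx = mp.get_context("spawn")  # CUDA generally requires the spawn start method
        procs = [ctx.Process(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
        for rank, p in enumerate(procs):
            p.start()
            print(f"Started process {rank} with pid {p.pid}")
        for p in procs:
            p.join()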
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:01:06.1323632Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20564
2022-09-27T16:01:06.1330229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20565
2022-09-27T16:01:06.1336803Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20566
2022-09-27T16:01:06.1343814Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20567
2022-09-27T16:01:07.7635009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:07.7635536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:07.7638160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:07.7638635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:07.7898354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:07.7898829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:07.7901502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:07.7901966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:07.7939488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:07.7939953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:07.7943758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:07.7944223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:07.8104191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:07.8104650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:07.8107979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:07.8108443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:08.0183239Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7_j5qsu7
2022-09-27T16:01:08.0184329Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7_j5qsu7/_remote_module_non_scriptable.py
2022-09-27T16:01:08.0189131Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpta2no2_u
2022-09-27T16:01:08.0192228Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpta2no2_u/_remote_module_non_scriptable.py
2022-09-27T16:01:08.0237170Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_ba7jwpg
2022-09-27T16:01:08.0239751Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_ba7jwpg/_remote_module_non_scriptable.py
2022-09-27T16:01:08.0331418Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr5qsvi4t
2022-09-27T16:01:08.0334027Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr5qsvi4t/_remote_module_non_scriptable.py
2022-09-27T16:01:08.4665683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:01:08.4666189Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:01:08.4679700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:01:08.4754977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:01:08.5893807Z fi_getinfo: -61
2022-09-27T16:01:08.5897505Z fi_getinfo: -61
2022-09-27T16:01:08.5900895Z fi_getinfo: -61
2022-09-27T16:01:08.5965054Z fi_getinfo: -61
2022-09-27T16:01:12.2477829Z ok (7.620s)
2022-09-27T16:01:12.2478177Z
2022-09-27T16:01:12.2478646Z ----------------------------------------------------------------------
2022-09-27T16:01:12.2478992Z Ran 1 test in 7.620s
2022-09-27T16:01:12.2479155Z
2022-09-27T16:01:12.2479246Z OK
2022-09-27T16:01:12.2479362Z
2022-09-27T16:01:12.2479494Z Generating XML reports...
2022-09-27T16:01:12.2514590Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160104.xml
2022-09-27T16:01:14.1881957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:14.1882776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:14.1884616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:14.1885085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:14.4286493Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqt4uw06u
2022-09-27T16:01:14.4287821Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqt4uw06u/_remote_module_non_scriptable.py
2022-09-27T16:01:14.8686632Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:01:14.8702380Z
2022-09-27T16:01:14.8702653Z Running tests...
2022-09-27T16:01:14.8703416Z ----------------------------------------------------------------------
2022-09-27T16:01:16.3549105Z test_device_map_gpu_to_cpu_non_default (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
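The "Generating XML reports... / Generated XML report: ..." pairs are the per-run JUnit-style output that later feeds the CI results dashboard. A hypothetical minimal reproduction of that step, assuming the unittest-xml-reporting package (imported as xmlrunner), which writes TEST-*.xml files under the output directory much like the ones named above:

    import unittest
    import xmlrunner  # pip install unittest-xml-reporting

    class SmokeTest(unittest.TestCase):
        def test_ok(self):
            self.assertTrue(True)

    if __name__ == "__main__":
        # Writes a TEST-*.xml report under test-reports/python-unittest
        unittest.main(
            testRunner=xmlrunner.XMLTestRunner(output="test-reports/python-unittest")
        )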
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:01:16.3833283Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20915
2022-09-27T16:01:16.3839489Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20916
2022-09-27T16:01:16.3846082Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 20917
2022-09-27T16:01:16.3853158Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 20918
2022-09-27T16:01:18.0143656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:18.0144150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:18.0146314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:18.0146806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:18.0160685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:18.0161135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:18.0164534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:18.0164997Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:18.0401600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:18.0402073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:18.0405912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:18.0406395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:18.0440313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:18.0440774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:18.0444937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:18.0445432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:18.2461787Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1_tq_3ny
2022-09-27T16:01:18.2462802Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_tq_3ny/_remote_module_non_scriptable.py
2022-09-27T16:01:18.2673405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjuuzixxj
2022-09-27T16:01:18.2676519Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjuuzixxj/_remote_module_non_scriptable.py
2022-09-27T16:01:18.2692997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph9h7vvmj
2022-09-27T16:01:18.2696060Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph9h7vvmj/_remote_module_non_scriptable.py
2022-09-27T16:01:18.2718933Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkz3kvmt6
2022-09-27T16:01:18.2721702Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkz3kvmt6/_remote_module_non_scriptable.py
2022-09-27T16:01:18.6934789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:01:18.7168194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:01:18.7202076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:01:18.7235159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:01:18.8152241Z fi_getinfo: -61
2022-09-27T16:01:18.8383958Z fi_getinfo: -61
2022-09-27T16:01:18.8415998Z fi_getinfo: -61
2022-09-27T16:01:18.8451141Z fi_getinfo: -61
2022-09-27T16:01:22.3998213Z ok (7.529s)
2022-09-27T16:01:22.3998429Z
2022-09-27T16:01:22.3998846Z ----------------------------------------------------------------------
2022-09-27T16:01:22.3999210Z Ran 1 test in 7.529s
2022-09-27T16:01:22.3999357Z
2022-09-27T16:01:22.3999452Z OK
2022-09-27T16:01:22.3999590Z
2022-09-27T16:01:22.3999723Z Generating XML reports...
2022-09-27T16:01:22.4036481Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160114.xml
2022-09-27T16:01:24.3759531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:24.3760037Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:24.3761440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:24.3761903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:24.6031994Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps00h7tns
2022-09-27T16:01:24.6035375Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps00h7tns/_remote_module_non_scriptable.py
2022-09-27T16:01:25.0297620Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:01:25.0312591Z
2022-09-27T16:01:25.0313310Z Running tests...
2022-09-27T16:01:25.0313837Z ----------------------------------------------------------------------
2022-09-27T16:01:26.4852104Z test_device_maps_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
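The stray "INFO:numba.cuda.cudadrv.driver:init" lines that open each run are numba lazily initializing the CUDA driver inside the test process. Something as small as the following triggers the same initialization, assuming numba and a CUDA device are present:

    from numba import cuda

    # First touch of the driver logs "numba.cuda.cudadrv.driver:init"
    # when numba's INFO logging is visible.
    cuda.detect()  # prints the CUDA devices numba can see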
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:01:26.5029059Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21266
2022-09-27T16:01:26.5035563Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21267
2022-09-27T16:01:26.5041728Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21268
2022-09-27T16:01:26.5048868Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21269
2022-09-27T16:01:28.1291756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:28.1292310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:28.1292878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:28.1293631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:28.1294322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:28.1294796Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:28.1295366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:28.1295986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:28.1546615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:28.1547328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:28.1550380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:28.1551251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:28.1792273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:28.1792883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:28.1796883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:28.1797429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:28.3959752Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9ho4ga75
2022-09-27T16:01:28.3960391Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9ho4ga75/_remote_module_non_scriptable.py
2022-09-27T16:01:28.3975920Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptoo859q9
2022-09-27T16:01:28.3978809Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptoo859q9/_remote_module_non_scriptable.py
2022-09-27T16:01:28.4001345Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjwcburru
2022-09-27T16:01:28.4004213Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjwcburru/_remote_module_non_scriptable.py
2022-09-27T16:01:28.4148300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3ik5e3vu
2022-09-27T16:01:28.4151672Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3ik5e3vu/_remote_module_non_scriptable.py
2022-09-27T16:01:28.8400853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:01:28.8445164Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:01:28.8479380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:01:28.8633203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:01:28.9616499Z fi_getinfo: -61
2022-09-27T16:01:28.9657464Z fi_getinfo: -61
2022-09-27T16:01:28.9691365Z fi_getinfo: -61
2022-09-27T16:01:28.9846365Z fi_getinfo: -61
2022-09-27T16:01:34.8235086Z ok (9.792s)
2022-09-27T16:01:34.8235303Z
2022-09-27T16:01:34.8235682Z ----------------------------------------------------------------------
2022-09-27T16:01:34.8236050Z Ran 1 test in 9.792s
2022-09-27T16:01:34.8236216Z
2022-09-27T16:01:34.8236315Z OK
2022-09-27T16:01:34.8236448Z
2022-09-27T16:01:34.8236602Z Generating XML reports...
2022-09-27T16:01:34.8272880Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160125.xml
2022-09-27T16:01:36.8074796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:36.8075828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:36.8077028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:36.8077974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:37.0436628Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprgx9av_q
2022-09-27T16:01:37.0437940Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprgx9av_q/_remote_module_non_scriptable.py
2022-09-27T16:01:37.4858368Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:01:37.4875310Z
2022-09-27T16:01:37.4875569Z Running tests...
2022-09-27T16:01:37.4876004Z ----------------------------------------------------------------------
2022-09-27T16:01:38.9600218Z test_device_maps_in_options (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:01:38.9775100Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21625
2022-09-27T16:01:38.9781177Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21626
2022-09-27T16:01:38.9787317Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21627
2022-09-27T16:01:38.9794029Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21628
2022-09-27T16:01:40.6479332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:40.6480343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:40.6481549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:40.6482506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:40.6534843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:40.6535791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:40.6539308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:40.6540235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:40.6733917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:40.6734860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:40.6737023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:40.6737992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:40.7351665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:40.7352625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:40.7353740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:40.7354648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:40.8908776Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1pmp3biv
2022-09-27T16:01:40.8909960Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1pmp3biv/_remote_module_non_scriptable.py
2022-09-27T16:01:40.8933373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprhrw78cc
2022-09-27T16:01:40.8936610Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprhrw78cc/_remote_module_non_scriptable.py
2022-09-27T16:01:40.8966882Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4e72szil
2022-09-27T16:01:40.8968661Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4e72szil/_remote_module_non_scriptable.py
2022-09-27T16:01:40.9532407Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpifqjxfhp
2022-09-27T16:01:40.9533773Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpifqjxfhp/_remote_module_non_scriptable.py
2022-09-27T16:01:41.3329059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:01:41.3369817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:01:41.3481273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:01:41.3912328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:01:41.4543333Z fi_getinfo: -61
2022-09-27T16:01:41.4582797Z fi_getinfo: -61
2022-09-27T16:01:41.4695398Z fi_getinfo: -61
2022-09-27T16:01:41.5125865Z fi_getinfo: -61
2022-09-27T16:01:47.2979409Z ok (9.810s)
2022-09-27T16:01:47.2979670Z
2022-09-27T16:01:47.2980068Z ----------------------------------------------------------------------
2022-09-27T16:01:47.2980409Z Ran 1 test in 9.810s
2022-09-27T16:01:47.2980574Z
2022-09-27T16:01:47.2980672Z OK
2022-09-27T16:01:47.2980789Z
2022-09-27T16:01:47.2980919Z Generating XML reports...
2022-09-27T16:01:47.3017147Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160137.xml
2022-09-27T16:01:49.2724499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:49.2725227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:49.2726807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:49.2727288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:49.5065541Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3h29hv97
2022-09-27T16:01:49.5066523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3h29hv97/_remote_module_non_scriptable.py
2022-09-27T16:01:49.9416747Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:01:49.9433174Z
2022-09-27T16:01:49.9433599Z Running tests...
2022-09-27T16:01:49.9434099Z ----------------------------------------------------------------------
2022-09-27T16:01:51.4322509Z test_device_maps_invalid_max_local_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:01:51.4499930Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21984
2022-09-27T16:01:51.4506244Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21985
2022-09-27T16:01:51.4512917Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 21986
2022-09-27T16:01:51.4520486Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 21987
2022-09-27T16:01:53.0360002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:53.0360506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:53.0362895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:53.0363381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:53.0433521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:53.0433960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:53.0438449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:53.0438953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:53.0855470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:53.0855932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:53.0859120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:53.0859627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:53.1384737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:53.1385484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:53.1387182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:53.1387702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:53.2699194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1_ru00zy
2022-09-27T16:01:53.2700222Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_ru00zy/_remote_module_non_scriptable.py
2022-09-27T16:01:53.2773445Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz2cp2szk
2022-09-27T16:01:53.2776314Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz2cp2szk/_remote_module_non_scriptable.py
2022-09-27T16:01:53.3051860Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy_1i18mn
2022-09-27T16:01:53.3053375Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy_1i18mn/_remote_module_non_scriptable.py
2022-09-27T16:01:53.3647573Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps1qs8j7z
2022-09-27T16:01:53.3649882Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps1qs8j7z/_remote_module_non_scriptable.py
2022-09-27T16:01:53.7141896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:01:53.7207477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:01:53.7399406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:01:53.8125631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:01:53.8357495Z fi_getinfo: -61
2022-09-27T16:01:53.8421708Z fi_getinfo: -61
2022-09-27T16:01:53.8610748Z fi_getinfo: -61
2022-09-27T16:01:53.9341216Z fi_getinfo: -61
2022-09-27T16:01:54.4604073Z ok (4.517s)
2022-09-27T16:01:54.4604340Z
2022-09-27T16:01:54.4604941Z ----------------------------------------------------------------------
2022-09-27T16:01:54.4605317Z Ran 1 test in 4.517s
2022-09-27T16:01:54.4605483Z
2022-09-27T16:01:54.4605585Z OK
2022-09-27T16:01:54.4605718Z
2022-09-27T16:01:54.4605865Z Generating XML reports...
2022-09-27T16:01:54.4641393Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160149.xml
2022-09-27T16:01:56.4045154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:01:56.4045835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:01:56.4047338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:01:56.4047853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:01:56.6348502Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0j87zlji
2022-09-27T16:01:56.6350002Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0j87zlji/_remote_module_non_scriptable.py
2022-09-27T16:01:57.0637595Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:01:57.0652444Z
2022-09-27T16:01:57.0652720Z Running tests...
2022-09-27T16:01:57.0653442Z ----------------------------------------------------------------------
2022-09-27T16:01:58.5251996Z test_device_maps_invalid_max_remote_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:01:58.5428873Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22171
2022-09-27T16:01:58.5436024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22172
2022-09-27T16:01:58.5442329Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22173
2022-09-27T16:01:58.5448643Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22174
2022-09-27T16:02:00.1177885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:00.1178384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:00.1179626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:00.1180102Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:00.1387721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:00.1388162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:00.1391696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:00.1392188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:00.1485313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:00.1485760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:00.1489201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:00.1489678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:00.1562536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:00.1562972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:00.1566717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:00.1567196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:00.3781459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpty3c1awo
2022-09-27T16:02:00.3782735Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpty3c1awo/_remote_module_non_scriptable.py
2022-09-27T16:02:00.3795699Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpugpeqfpp
2022-09-27T16:02:00.3798244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpugpeqfpp/_remote_module_non_scriptable.py
2022-09-27T16:02:00.3827566Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp118awfrx
2022-09-27T16:02:00.3830315Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp118awfrx/_remote_module_non_scriptable.py
2022-09-27T16:02:00.3925512Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi7pl9z5l
2022-09-27T16:02:00.3928061Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi7pl9z5l/_remote_module_non_scriptable.py
2022-09-27T16:02:00.8223836Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:02:00.8305912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:02:00.8306422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:02:00.8504972Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:02:00.9440012Z fi_getinfo: -61
2022-09-27T16:02:00.9526951Z fi_getinfo: -61
2022-09-27T16:02:00.9530906Z fi_getinfo: -61
2022-09-27T16:02:00.9717980Z fi_getinfo: -61
2022-09-27T16:02:01.4521915Z ok (4.387s)
2022-09-27T16:02:01.4522152Z
2022-09-27T16:02:01.4522545Z ----------------------------------------------------------------------
2022-09-27T16:02:01.4522867Z Ran 1 test in 4.387s
2022-09-27T16:02:01.4523037Z
2022-09-27T16:02:01.4523130Z OK
2022-09-27T16:02:01.4523574Z
2022-09-27T16:02:01.4523710Z Generating XML reports...
2022-09-27T16:02:01.4559238Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160157.xml
2022-09-27T16:02:03.4412303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:03.4412822Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:03.4414275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:03.4414732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:03.6756399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0khfi9f_
2022-09-27T16:02:03.6757392Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0khfi9f_/_remote_module_non_scriptable.py
2022-09-27T16:02:04.1068105Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:02:04.1083020Z
2022-09-27T16:02:04.1083541Z Running tests...
2022-09-27T16:02:04.1084210Z ----------------------------------------------------------------------
2022-09-27T16:02:05.5764626Z test_device_maps_invalid_min_device (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:02:05.5941288Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22358
2022-09-27T16:02:05.5947873Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22359
2022-09-27T16:02:05.5954351Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22360
2022-09-27T16:02:05.5961119Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22361
2022-09-27T16:02:07.1807211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:07.1807748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:07.1808804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:07.1809263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:07.1830495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:07.1831187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:07.1834790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:07.1835265Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:07.1867060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:07.1867515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:07.1870947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:07.1871884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:07.2031975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:07.2032435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:07.2036129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:07.2036607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:07.4178455Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpta6h13gn
2022-09-27T16:02:07.4179035Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpta6h13gn/_remote_module_non_scriptable.py
2022-09-27T16:02:07.4223242Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1lbd64is
2022-09-27T16:02:07.4225904Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1lbd64is/_remote_module_non_scriptable.py
2022-09-27T16:02:07.4266485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpif12e1ra
2022-09-27T16:02:07.4269244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpif12e1ra/_remote_module_non_scriptable.py
2022-09-27T16:02:07.4339955Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpog52dhvb
2022-09-27T16:02:07.4342840Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpog52dhvb/_remote_module_non_scriptable.py
2022-09-27T16:02:07.8629880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:02:07.8704470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:02:07.8755859Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:02:07.8839569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:02:07.9846750Z fi_getinfo: -61
2022-09-27T16:02:07.9919144Z fi_getinfo: -61
2022-09-27T16:02:07.9971537Z fi_getinfo: -61
2022-09-27T16:02:08.0054862Z fi_getinfo: -61
2022-09-27T16:02:08.4030033Z ok (4.294s)
2022-09-27T16:02:08.4030284Z
2022-09-27T16:02:08.4031131Z ----------------------------------------------------------------------
2022-09-27T16:02:08.4031481Z Ran 1 test in 4.295s
2022-09-27T16:02:08.4031645Z
2022-09-27T16:02:08.4031742Z OK
2022-09-27T16:02:08.4031888Z
2022-09-27T16:02:08.4032122Z Generating XML reports...
2022-09-27T16:02:08.4067262Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160204.xml
2022-09-27T16:02:10.3452971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:02:10.3453512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:02:10.3455728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:02:10.3456223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:02:10.5850153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1apjouw4
2022-09-27T16:02:10.5852421Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1apjouw4/_remote_module_non_scriptable.py
2022-09-27T16:02:11.0295519Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:02:11.0312134Z
2022-09-27T16:02:11.0312420Z Running tests...
2022-09-27T16:02:11.0312861Z ----------------------------------------------------------------------
2022-09-27T16:02:12.5174845Z test_device_maps_many_to_one (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ...
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:02:12.5352653Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22533 2022-09-27T16:02:12.5359269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22534 2022-09-27T16:02:12.5365654Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22535 2022-09-27T16:02:12.5371815Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22536 2022-09-27T16:02:14.1268045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:14.1268982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:14.1269998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:14.1270476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:14.1272094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:14.1272556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:14.1276753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:14.1277236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:14.1333125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:14.1333588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:14.1337380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:14.1337860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:14.1479547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:14.1480007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:14.1484400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:14.1484880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:14.3636390Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq2c2bova 2022-09-27T16:02:14.3637671Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq2c2bova/_remote_module_non_scriptable.py 2022-09-27T16:02:14.3663934Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbhb6awa7 2022-09-27T16:02:14.3666904Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbhb6awa7/_remote_module_non_scriptable.py 2022-09-27T16:02:14.3696470Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfdpbys5e 2022-09-27T16:02:14.3699235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfdpbys5e/_remote_module_non_scriptable.py 2022-09-27T16:02:14.3840687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnq8_lrtg 2022-09-27T16:02:14.3843970Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnq8_lrtg/_remote_module_non_scriptable.py 2022-09-27T16:02:14.8141988Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
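The repeated UserWarnings come from the two warnings.warn calls quoted in the log itself (common_utils.py lines 123 and 127). A sketch of that load-and-warn shape — the file names below are hypothetical, since the log does not show where the slow- and disabled-test dictionaries are actually read from:

```python
# Load-and-warn pattern behind the repeated "loaded 45 slow tests" /
# "loaded 261 disabled tests" messages. The paths are hypothetical; the
# warn calls mirror the ones quoted verbatim in the log above.
import json
import warnings


def _load(path: str) -> dict:
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


slow_tests_dict = _load("slow_tests.json")          # hypothetical path
disabled_tests_dict = _load("disabled_tests.json")  # hypothetical path

if slow_tests_dict:
    warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
if disabled_tests_dict:
    warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
```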
2022-09-27T16:02:14.8162135Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:02:14.8167371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:02:14.8349785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:02:14.9358626Z fi_getinfo: -61 2022-09-27T16:02:14.9376076Z fi_getinfo: -61 2022-09-27T16:02:14.9379919Z fi_getinfo: -61 2022-09-27T16:02:14.9565817Z fi_getinfo: -61 2022-09-27T16:02:15.4452320Z ok (4.414s) 2022-09-27T16:02:15.4452560Z 2022-09-27T16:02:15.4453214Z ---------------------------------------------------------------------- 2022-09-27T16:02:15.4453588Z Ran 1 test in 4.414s 2022-09-27T16:02:15.4453757Z 2022-09-27T16:02:15.4453854Z OK 2022-09-27T16:02:15.4453990Z 2022-09-27T16:02:15.4454130Z Generating XML reports... 2022-09-27T16:02:15.4489419Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160211.xml 2022-09-27T16:02:17.4543354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:17.4543872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:17.4545971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:17.4546783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:17.6884619Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplriefubx 2022-09-27T16:02:17.6885860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplriefubx/_remote_module_non_scriptable.py 2022-09-27T16:02:18.1198639Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:02:18.1213608Z 2022-09-27T16:02:18.1213867Z Running tests... 2022-09-27T16:02:18.1214320Z ---------------------------------------------------------------------- 2022-09-27T16:02:19.5818219Z test_device_maps_missing_config (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:02:19.5996906Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22720 2022-09-27T16:02:19.6003294Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22721 2022-09-27T16:02:19.6009641Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 22722 2022-09-27T16:02:19.6016115Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 22723 2022-09-27T16:02:21.2102183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:21.2103177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:21.2104303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:21.2105197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:21.2106388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:21.2107310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:21.2108479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:21.2109445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:21.2374114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:21.2374994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:21.2377992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:21.2378887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:21.2926561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:21.2927541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:21.2929478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:21.2930533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:21.4692595Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk8_0zbtq 2022-09-27T16:02:21.4693797Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk8_0zbtq/_remote_module_non_scriptable.py 2022-09-27T16:02:21.4726102Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7rnnwwpz 2022-09-27T16:02:21.4728224Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7rnnwwpz/_remote_module_non_scriptable.py 2022-09-27T16:02:21.4730198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkvbd1aoh 2022-09-27T16:02:21.4734925Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkvbd1aoh/_remote_module_non_scriptable.py 2022-09-27T16:02:21.5214694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4zin458s 2022-09-27T16:02:21.5216579Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4zin458s/_remote_module_non_scriptable.py 2022-09-27T16:02:21.9169240Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
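The four `fi_getinfo: -61` lines per test come from libfabric's provider-discovery call, once per rank. Libfabric error codes mirror errno values, and 61 is ENODATA on Linux, so -61 most likely means "no matching fabric provider found" (for example, no EFA interface on this runner) rather than a failure — consistent with every test still finishing `ok`. The errno mapping can be checked from Python:

```python
# Confirm that error code 61 is ENODATA on Linux; libfabric's fi_getinfo
# returning -61 most likely indicates "no matching fabric provider",
# not a test failure.
import errno
import os

assert errno.ENODATA == 61
print(os.strerror(errno.ENODATA))  # "No data available"
```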
2022-09-27T16:02:21.9257422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:02:21.9261270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:02:21.9659428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:02:22.0392614Z fi_getinfo: -61 2022-09-27T16:02:22.0476206Z fi_getinfo: -61 2022-09-27T16:02:22.0481003Z fi_getinfo: -61 2022-09-27T16:02:22.0875097Z fi_getinfo: -61 2022-09-27T16:02:24.5130743Z ok (6.391s) 2022-09-27T16:02:24.5131058Z 2022-09-27T16:02:24.5131681Z ---------------------------------------------------------------------- 2022-09-27T16:02:24.5132036Z Ran 1 test in 6.392s 2022-09-27T16:02:24.5132199Z 2022-09-27T16:02:24.5132300Z OK 2022-09-27T16:02:24.5132431Z 2022-09-27T16:02:24.5132566Z Generating XML reports... 2022-09-27T16:02:24.5171182Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160218.xml 2022-09-27T16:02:26.4953233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:26.4953882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:26.4956068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:26.4956528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:26.7239690Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzulbsasy 2022-09-27T16:02:26.7240615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzulbsasy/_remote_module_non_scriptable.py 2022-09-27T16:02:27.1531793Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:02:27.1546211Z 2022-09-27T16:02:27.1546547Z Running tests... 2022-09-27T16:02:27.1546980Z ---------------------------------------------------------------------- 2022-09-27T16:02:28.6132987Z test_device_maps_missing_config_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:02:28.6311115Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23067 2022-09-27T16:02:28.6318204Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23068 2022-09-27T16:02:28.6325189Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23069 2022-09-27T16:02:28.6331497Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23070 2022-09-27T16:02:30.2226874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:30.2227415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:30.2228562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:30.2229017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:30.2229603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:30.2230072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:30.2232376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:30.2232831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:30.2236667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:30.2237282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:30.2241465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:30.2241926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:30.2471349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:30.2472251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:30.2474337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:30.2475202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:30.4443648Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2g1j0z4m 2022-09-27T16:02:30.4444817Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2g1j0z4m/_remote_module_non_scriptable.py 2022-09-27T16:02:30.4868201Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx11pgcpe 2022-09-27T16:02:30.4869355Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx11pgcpe/_remote_module_non_scriptable.py 2022-09-27T16:02:30.4894188Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp07lbo9hi 2022-09-27T16:02:30.4896434Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp07lbo9hi/_remote_module_non_scriptable.py 2022-09-27T16:02:30.4960344Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzbpc5yjp 2022-09-27T16:02:30.4962607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzbpc5yjp/_remote_module_non_scriptable.py 2022-09-27T16:02:30.8767195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
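The `test_device_maps_*` cases running here exercise the TensorPipe agent's device-map feature, which tells RPC how CUDA tensors should be placed when they cross workers. A minimal sketch of that configuration — the worker names, ranks, and the cuda:0 -> cuda:1 mapping are illustrative, not taken from these tests:

```python
# Minimal sketch of the device-map configuration these tests exercise.
# Worker names, ranks, and the cuda:0 -> cuda:1 mapping are illustrative.
import os

import torch.distributed.rpc as rpc

os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")

options = rpc.TensorPipeRpcBackendOptions()
# Tensors sent from this worker's cuda:0 should land on worker1's cuda:1.
options.set_device_map("worker1", {0: 1})

rpc.init_rpc(
    "worker0",
    rank=0,
    world_size=2,  # blocks until a peer joins as rank 1
    rpc_backend_options=options,
)
# ... rpc.rpc_sync / rpc.remote calls carrying CUDA tensors go here ...
rpc.shutdown()
```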
2022-09-27T16:02:30.9304731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:02:30.9322788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:02:30.9452852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:02:31.0079507Z fi_getinfo: -61 2022-09-27T16:02:31.0529475Z fi_getinfo: -61 2022-09-27T16:02:31.0538698Z fi_getinfo: -61 2022-09-27T16:02:31.0665783Z fi_getinfo: -61 2022-09-27T16:02:33.5442512Z ok (6.389s) 2022-09-27T16:02:33.5442727Z 2022-09-27T16:02:33.5443142Z ---------------------------------------------------------------------- 2022-09-27T16:02:33.5443465Z Ran 1 test in 6.389s 2022-09-27T16:02:33.5443632Z 2022-09-27T16:02:33.5443728Z OK 2022-09-27T16:02:33.5443865Z 2022-09-27T16:02:33.5444001Z Generating XML reports... 2022-09-27T16:02:33.5481022Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160227.xml 2022-09-27T16:02:35.5212553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:35.5213071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:35.5214738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:35.5215240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:35.7597571Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpts6dlkjp 2022-09-27T16:02:35.7598721Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpts6dlkjp/_remote_module_non_scriptable.py 2022-09-27T16:02:36.2128595Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:02:36.2143750Z 2022-09-27T16:02:36.2143993Z Running tests... 2022-09-27T16:02:36.2144424Z ---------------------------------------------------------------------- 2022-09-27T16:02:37.7047138Z test_device_maps_missing_config_not_timeout (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:02:37.7224910Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23414 2022-09-27T16:02:37.7231138Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23415 2022-09-27T16:02:37.7238244Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23416 2022-09-27T16:02:37.7245024Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23417 2022-09-27T16:02:39.3429758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:39.3430285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:39.3431617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:39.3432074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:39.3432648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:39.3433120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:39.3436708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:39.3437191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:39.3455280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:39.3455753Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:39.3459457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:39.3459933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:39.3903793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:39.3904275Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:39.3907479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:39.3907951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:39.5696744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxertruqf 2022-09-27T16:02:39.5697659Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxertruqf/_remote_module_non_scriptable.py 2022-09-27T16:02:39.5954263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk_0lqj70 2022-09-27T16:02:39.5956985Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk_0lqj70/_remote_module_non_scriptable.py 2022-09-27T16:02:39.6008886Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp16i335c_ 2022-09-27T16:02:39.6012367Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp16i335c_/_remote_module_non_scriptable.py 2022-09-27T16:02:39.6195583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoj1fslkz 2022-09-27T16:02:39.6198622Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoj1fslkz/_remote_module_non_scriptable.py 2022-09-27T16:02:40.0082799Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T16:02:40.0368499Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:02:40.0456163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:02:40.0676405Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:02:40.1401574Z fi_getinfo: -61 2022-09-27T16:02:40.1583825Z fi_getinfo: -61 2022-09-27T16:02:40.1676812Z fi_getinfo: -61 2022-09-27T16:02:40.1889557Z fi_getinfo: -61 2022-09-27T16:02:42.6366999Z ok (6.422s) 2022-09-27T16:02:42.6367439Z 2022-09-27T16:02:42.6368123Z ---------------------------------------------------------------------- 2022-09-27T16:02:42.6368736Z Ran 1 test in 6.422s 2022-09-27T16:02:42.6369027Z 2022-09-27T16:02:42.6369188Z OK 2022-09-27T16:02:42.6369430Z 2022-09-27T16:02:42.6369662Z Generating XML reports... 2022-09-27T16:02:42.6407788Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160236.xml 2022-09-27T16:02:44.6092595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:44.6093078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:44.6095225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:44.6095716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:44.8482847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk4sp1il4 2022-09-27T16:02:44.8484090Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk4sp1il4/_remote_module_non_scriptable.py 2022-09-27T16:02:45.2936349Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:02:45.2951523Z 2022-09-27T16:02:45.2951864Z Running tests... 2022-09-27T16:02:45.2952349Z ---------------------------------------------------------------------- 2022-09-27T16:02:46.8048653Z test_device_maps_missing_config_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:02:46.8224361Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23761 2022-09-27T16:02:46.8230731Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23762 2022-09-27T16:02:46.8238261Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 23763 2022-09-27T16:02:46.8244391Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 23764 2022-09-27T16:02:48.4144466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:48.4144960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:48.4146327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:48.4146809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:48.4201213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:48.4201655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:48.4205380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:48.4205856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:48.4791194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:48.4791691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:48.4794497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:48.4794972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:48.4838123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:48.4838569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:48.4842134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:48.4842763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:48.6415900Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn1ky_ia3 2022-09-27T16:02:48.6416841Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn1ky_ia3/_remote_module_non_scriptable.py 2022-09-27T16:02:48.6548875Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgkr6i_58 2022-09-27T16:02:48.6551558Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgkr6i_58/_remote_module_non_scriptable.py 2022-09-27T16:02:48.7040682Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptil3i_5b 2022-09-27T16:02:48.7041743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptil3i_5b/_remote_module_non_scriptable.py 2022-09-27T16:02:48.7152215Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6qo2x_l_ 2022-09-27T16:02:48.7154964Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6qo2x_l_/_remote_module_non_scriptable.py 2022-09-27T16:02:49.0824473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
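The instantiator lines (one temporary directory plus one generated `_remote_module_non_scriptable.py` per rank) record the code-generation step behind `RemoteModule`; it appears to run when the RemoteModule machinery is first imported, since it shows up in every worker before any test body does. A hedged sketch of the usage that relies on that generated module — the remote device string is illustrative and assumes an RPC group is already initialized:

```python
# Sketch of the RemoteModule usage that depends on the generated
# _remote_module_non_scriptable.py logged above. Assumes rpc.init_rpc
# has already run; "worker1/cuda:0" is an illustrative remote device.
import torch.nn as nn
from torch.distributed.nn import RemoteModule

remote_linear = RemoteModule(
    "worker1/cuda:0",  # "<worker name>/<device>"
    nn.Linear,         # module class constructed on the remote worker
    args=(32, 16),
)
```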
2022-09-27T16:02:49.0888943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:02:49.1443920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:02:49.1616118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:02:49.2040454Z fi_getinfo: -61 2022-09-27T16:02:49.2102327Z fi_getinfo: -61 2022-09-27T16:02:49.2656923Z fi_getinfo: -61 2022-09-27T16:02:49.2829318Z fi_getinfo: -61 2022-09-27T16:02:51.7356646Z ok (6.440s) 2022-09-27T16:02:51.7356984Z 2022-09-27T16:02:51.7357613Z ---------------------------------------------------------------------- 2022-09-27T16:02:51.7358221Z Ran 1 test in 6.440s 2022-09-27T16:02:51.7358472Z 2022-09-27T16:02:51.7358635Z OK 2022-09-27T16:02:51.7358869Z 2022-09-27T16:02:51.7359083Z Generating XML reports... 2022-09-27T16:02:51.7393765Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160245.xml 2022-09-27T16:02:53.7080645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:53.7081146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:53.7083440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:53.7083918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:53.9347077Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn915o3ca 2022-09-27T16:02:53.9348393Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn915o3ca/_remote_module_non_scriptable.py 2022-09-27T16:02:54.3956572Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:02:54.3971449Z 2022-09-27T16:02:54.3972078Z Running tests... 2022-09-27T16:02:54.3972541Z ---------------------------------------------------------------------- 2022-09-27T16:02:55.8729033Z test_device_maps_missing_config_remote_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:02:55.8906624Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24108 2022-09-27T16:02:55.8913621Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24109 2022-09-27T16:02:55.8920379Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24110 2022-09-27T16:02:55.8926552Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24111 2022-09-27T16:02:57.4792821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:57.4793729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:57.4794828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:57.4795381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:57.4795958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:57.4796427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:57.4797012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:57.4797461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:57.5079376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:57.5079848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:57.5083047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:57.5083534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:57.5122200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:02:57.5122659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:02:57.5126073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:02:57.5126551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:02:57.7389100Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe9g2gc0e 2022-09-27T16:02:57.7389709Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0lg1lviv 2022-09-27T16:02:57.7390257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe9g2gc0e/_remote_module_non_scriptable.py 2022-09-27T16:02:57.7391145Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0lg1lviv/_remote_module_non_scriptable.py 2022-09-27T16:02:57.7399059Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkafydl1v 2022-09-27T16:02:57.7401652Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkafydl1v/_remote_module_non_scriptable.py 2022-09-27T16:02:57.7416099Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpywwouzea 2022-09-27T16:02:57.7419184Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpywwouzea/_remote_module_non_scriptable.py 2022-09-27T16:02:58.1810755Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
2022-09-27T16:02:58.1853317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:02:58.1869424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:02:58.1870562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:02:58.3028330Z fi_getinfo: -61 2022-09-27T16:02:58.3071191Z fi_getinfo: -61 2022-09-27T16:02:58.3088774Z fi_getinfo: -61 2022-09-27T16:02:58.3092379Z fi_getinfo: -61 2022-09-27T16:03:00.7044407Z ok (6.307s) 2022-09-27T16:03:00.7044628Z 2022-09-27T16:03:00.7045038Z ---------------------------------------------------------------------- 2022-09-27T16:03:00.7045362Z Ran 1 test in 6.307s 2022-09-27T16:03:00.7045528Z 2022-09-27T16:03:00.7045625Z OK 2022-09-27T16:03:00.7045760Z 2022-09-27T16:03:00.7045898Z Generating XML reports... 2022-09-27T16:03:00.7081083Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160254.xml 2022-09-27T16:03:02.6475742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:02.6476683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:02.6478052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:02.6478530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:02.8901299Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzpwwuq53 2022-09-27T16:03:02.8902236Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzpwwuq53/_remote_module_non_scriptable.py 2022-09-27T16:03:03.3266512Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:03:03.3282071Z 2022-09-27T16:03:03.3282320Z Running tests... 2022-09-27T16:03:03.3282761Z ---------------------------------------------------------------------- 2022-09-27T16:03:04.8269682Z test_device_maps_missing_config_response (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:03:04.8452864Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24455 2022-09-27T16:03:04.8459235Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24456 2022-09-27T16:03:04.8465804Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24457 2022-09-27T16:03:04.8472954Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24458 2022-09-27T16:03:06.4780672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:06.4781184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:06.4782695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:06.4783166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:06.4786938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:06.4787394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:06.4790920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:06.4791591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:06.5081796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:06.5082253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:06.5086107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:06.5086573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:06.5121788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:06.5122522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:06.5125946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:06.5126401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:06.7029306Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3y2afl1g 2022-09-27T16:03:06.7030197Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3y2afl1g/_remote_module_non_scriptable.py 2022-09-27T16:03:06.7293125Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp01trqq0p 2022-09-27T16:03:06.7295643Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp01trqq0p/_remote_module_non_scriptable.py 2022-09-27T16:03:06.7411035Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd7hn2p0_ 2022-09-27T16:03:06.7414118Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd7hn2p0_/_remote_module_non_scriptable.py 2022-09-27T16:03:06.7448731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1_hrfc5r 2022-09-27T16:03:06.7451380Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_hrfc5r/_remote_module_non_scriptable.py 2022-09-27T16:03:07.1407363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
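Each test block ends by writing a JUnit-style XML report under `test-reports/python-unittest/...`; those files can be summarized directly. A stdlib sketch, using a report path taken from the log — the tag and attribute names assume the common JUnit schema, so treat them as an assumption rather than a documented format:

```python
# Summarize one of the XML reports named in the log with the stdlib.
# Tag/attribute names assume the usual JUnit schema.
import xml.etree.ElementTree as ET

path = (
    "test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/"
    "TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160254.xml"
)

root = ET.parse(path).getroot()
suite = root if root.tag == "testsuite" else root.find("testsuite")
print(
    suite.get("tests"), "tests,",
    suite.get("failures"), "failures,",
    suite.get("errors"), "errors in",
    suite.get("time"), "seconds",
)
```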
2022-09-27T16:03:07.1753410Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:03:07.1940411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:03:07.1942576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:03:07.2627825Z fi_getinfo: -61 2022-09-27T16:03:07.2968205Z fi_getinfo: -61 2022-09-27T16:03:07.3161885Z fi_getinfo: -61 2022-09-27T16:03:07.3166314Z fi_getinfo: -61 2022-09-27T16:03:09.6586248Z ok (6.330s) 2022-09-27T16:03:09.6586494Z 2022-09-27T16:03:09.6586886Z ---------------------------------------------------------------------- 2022-09-27T16:03:09.6587234Z Ran 1 test in 6.330s 2022-09-27T16:03:09.6587393Z 2022-09-27T16:03:09.6587471Z OK 2022-09-27T16:03:09.6587608Z 2022-09-27T16:03:09.6587744Z Generating XML reports... 2022-09-27T16:03:09.6622963Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160303.xml 2022-09-27T16:03:11.6332207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:11.6332872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:11.6334175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:11.6334675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:11.8633219Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb7jgofr3 2022-09-27T16:03:11.8634753Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb7jgofr3/_remote_module_non_scriptable.py 2022-09-27T16:03:12.2825461Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:03:12.2840571Z 2022-09-27T16:03:12.2840938Z Running tests... 2022-09-27T16:03:12.2841443Z ---------------------------------------------------------------------- 2022-09-27T16:03:13.7485974Z test_device_maps_missing_config_response_loop (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:03:13.7661486Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24802 2022-09-27T16:03:13.7667669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24803 2022-09-27T16:03:13.7674144Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 24804 2022-09-27T16:03:13.7680888Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 24805 2022-09-27T16:03:15.3432890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:15.3433394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:15.3434862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:15.3435323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:15.3465740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:15.3466208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:15.3469559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:15.3470025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:15.3488025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:15.3488485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:15.3492152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:15.3492610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:15.3748006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:15.3748465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:15.3752230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:15.3752701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:15.5847051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp79hz2n8t 2022-09-27T16:03:15.5848224Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp79hz2n8t/_remote_module_non_scriptable.py 2022-09-27T16:03:15.5854886Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiilh5lzl 2022-09-27T16:03:15.5857655Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiilh5lzl/_remote_module_non_scriptable.py 2022-09-27T16:03:15.5883658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp727tl3h5 2022-09-27T16:03:15.5886232Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp727tl3h5/_remote_module_non_scriptable.py 2022-09-27T16:03:15.6067478Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplzdw82n0 2022-09-27T16:03:15.6069009Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplzdw82n0/_remote_module_non_scriptable.py 2022-09-27T16:03:16.0322998Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T16:03:16.0326739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:03:16.0338743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:03:16.0564768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:03:16.1541188Z fi_getinfo: -61 2022-09-27T16:03:16.1544503Z fi_getinfo: -61 2022-09-27T16:03:16.1552213Z fi_getinfo: -61 2022-09-27T16:03:16.1777033Z fi_getinfo: -61 2022-09-27T16:03:18.6801808Z ok (6.396s) 2022-09-27T16:03:18.6802030Z 2022-09-27T16:03:18.6802646Z ---------------------------------------------------------------------- 2022-09-27T16:03:18.6803034Z Ran 1 test in 6.396s 2022-09-27T16:03:18.6803196Z 2022-09-27T16:03:18.6803298Z OK 2022-09-27T16:03:18.6803434Z 2022-09-27T16:03:18.6804916Z Generating XML reports... 2022-09-27T16:03:18.6838671Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160312.xml 2022-09-27T16:03:20.6599165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:20.6599674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:20.6601552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:20.6602212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:20.8866544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwqxjwtxr 2022-09-27T16:03:20.8867796Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwqxjwtxr/_remote_module_non_scriptable.py 2022-09-27T16:03:21.3215579Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:03:21.3230038Z 2022-09-27T16:03:21.3230331Z Running tests... 2022-09-27T16:03:21.3231182Z ---------------------------------------------------------------------- 2022-09-27T16:03:22.7676013Z test_device_maps_multi_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:03:22.7852415Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25149 2022-09-27T16:03:22.7858837Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25150 2022-09-27T16:03:22.7865016Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25151 2022-09-27T16:03:22.7871720Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25152 2022-09-27T16:03:24.3761862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:24.3762372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:24.3763496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:24.3763974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:24.3843564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:24.3844004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:24.3847478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:24.3847961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:24.3860088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:24.3860539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:24.3864120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:24.3864600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:24.4082168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:24.4082616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:24.4086263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:24.4086741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:24.6223410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph21ujqfd 2022-09-27T16:03:24.6224313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph21ujqfd/_remote_module_non_scriptable.py 2022-09-27T16:03:24.6240578Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5x12j_ic 2022-09-27T16:03:24.6243270Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5x12j_ic/_remote_module_non_scriptable.py 2022-09-27T16:03:24.6256375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvrianwaw 2022-09-27T16:03:24.6259423Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvrianwaw/_remote_module_non_scriptable.py 2022-09-27T16:03:24.6384630Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp6elc0f9 2022-09-27T16:03:24.6387928Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp6elc0f9/_remote_module_non_scriptable.py 2022-09-27T16:03:25.0693922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
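The per-test wall times reported as `ok (N.NNNs)` throughout this section (mostly ~4.3s and ~6.4s, with the multi-GPU cases closer to 9.8s) can be totaled straight from the raw log. A small sketch — `ci.log` is a placeholder file name:

```python
# Total the per-test durations reported as "ok (N.NNNs)" in a raw log
# like this one. "ci.log" is a placeholder file name.
import re

pattern = re.compile(r"ok \((\d+\.\d+)s\)")
with open("ci.log") as f:
    durations = [float(m.group(1)) for m in pattern.finditer(f.read())]

print(f"{len(durations)} tests, {sum(durations):.3f}s total")
```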
2022-09-27T16:03:25.0697233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:03:25.0726113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:03:25.0921928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:03:25.1912681Z fi_getinfo: -61 2022-09-27T16:03:25.1916674Z fi_getinfo: -61 2022-09-27T16:03:25.1940336Z fi_getinfo: -61 2022-09-27T16:03:25.2134517Z fi_getinfo: -61 2022-09-27T16:03:31.1049885Z ok (9.782s) 2022-09-27T16:03:31.1050267Z 2022-09-27T16:03:31.1050719Z ---------------------------------------------------------------------- 2022-09-27T16:03:31.1051046Z Ran 1 test in 9.782s 2022-09-27T16:03:31.1051216Z 2022-09-27T16:03:31.1051314Z OK 2022-09-27T16:03:31.1051450Z 2022-09-27T16:03:31.1051580Z Generating XML reports... 2022-09-27T16:03:31.1086691Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160321.xml 2022-09-27T16:03:33.1018459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:33.1018966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:33.1021377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:33.1021851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:33.3342366Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5bz5jnac 2022-09-27T16:03:33.3343768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5bz5jnac/_remote_module_non_scriptable.py 2022-09-27T16:03:33.7671059Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:03:33.7685535Z 2022-09-27T16:03:33.7685824Z Running tests... 2022-09-27T16:03:33.7686370Z ---------------------------------------------------------------------- 2022-09-27T16:03:35.2292661Z test_device_maps_multi_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:03:35.2470929Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25508 2022-09-27T16:03:35.2477344Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25509 2022-09-27T16:03:35.2483924Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25510 2022-09-27T16:03:35.2490600Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25511 2022-09-27T16:03:36.8354380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:36.8354874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:36.8357192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:36.8357713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:36.8373526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:36.8374146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:36.8377137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:36.8377620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:36.8723217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:36.8723688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:36.8726467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:36.8727126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:36.8894265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:36.8894721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:36.8898315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:36.8898774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:37.0605804Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_xschrha 2022-09-27T16:03:37.0606627Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_xschrha/_remote_module_non_scriptable.py 2022-09-27T16:03:37.0795537Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg04rx7wc 2022-09-27T16:03:37.0798153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg04rx7wc/_remote_module_non_scriptable.py 2022-09-27T16:03:37.1066369Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4nt14s32 2022-09-27T16:03:37.1069030Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4nt14s32/_remote_module_non_scriptable.py 2022-09-27T16:03:37.1119853Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprm7qn7mo 2022-09-27T16:03:37.1122603Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprm7qn7mo/_remote_module_non_scriptable.py 2022-09-27T16:03:37.4957212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:03:37.5277638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:03:37.5516001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:03:37.5553020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:03:37.6280373Z fi_getinfo: -61 2022-09-27T16:03:37.6505364Z fi_getinfo: -61 2022-09-27T16:03:37.6729903Z fi_getinfo: -61 2022-09-27T16:03:37.6768088Z fi_getinfo: -61 2022-09-27T16:03:43.5671738Z ok (9.798s) 2022-09-27T16:03:43.5672090Z 2022-09-27T16:03:43.5672583Z ---------------------------------------------------------------------- 2022-09-27T16:03:43.5672930Z Ran 1 test in 9.798s 2022-09-27T16:03:43.5673092Z 2022-09-27T16:03:43.5673170Z OK 2022-09-27T16:03:43.5673305Z 2022-09-27T16:03:43.5673438Z Generating XML reports... 2022-09-27T16:03:43.5709196Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160333.xml 2022-09-27T16:03:45.5165183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:45.5166184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:45.5167440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:45.5168750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:45.7451427Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp68_cxvc4 2022-09-27T16:03:45.7452393Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp68_cxvc4/_remote_module_non_scriptable.py 2022-09-27T16:03:46.1731810Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:03:46.1746778Z 2022-09-27T16:03:46.1747108Z Running tests... 2022-09-27T16:03:46.1748183Z ---------------------------------------------------------------------- 2022-09-27T16:03:47.6402525Z test_device_maps_one_to_many (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:03:47.6580563Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25859 2022-09-27T16:03:47.6587008Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25860 2022-09-27T16:03:47.6593617Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 25861 2022-09-27T16:03:47.6600803Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 25862 2022-09-27T16:03:49.2508663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:49.2509650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:49.2512263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:49.2513207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:49.2641577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:49.2642089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:49.2645233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:49.2645733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:49.2667208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:49.2667671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:49.2671282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:49.2671992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:49.2672581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:49.2673028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:49.2676172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:49.2676655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:49.5089966Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp75y1ci3x 2022-09-27T16:03:49.5091187Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp75y1ci3x/_remote_module_non_scriptable.py 2022-09-27T16:03:49.5106574Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmyp7m4lw 2022-09-27T16:03:49.5108664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmyp7m4lw/_remote_module_non_scriptable.py 2022-09-27T16:03:49.5138022Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplrvj6pj3 2022-09-27T16:03:49.5140362Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplrvj6pj3/_remote_module_non_scriptable.py 2022-09-27T16:03:49.5142003Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1bi3bpog 2022-09-27T16:03:49.5145744Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1bi3bpog/_remote_module_non_scriptable.py 2022-09-27T16:03:49.9600501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T16:03:49.9619674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:03:49.9641727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:03:49.9651456Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:03:50.0860539Z fi_getinfo: -61 2022-09-27T16:03:50.4679965Z ok (4.293s) 2022-09-27T16:03:50.4680169Z 2022-09-27T16:03:50.4680566Z ---------------------------------------------------------------------- 2022-09-27T16:03:50.4680923Z Ran 1 test in 4.293s 2022-09-27T16:03:50.4681387Z 2022-09-27T16:03:50.4681491Z OK 2022-09-27T16:03:50.4681622Z 2022-09-27T16:03:50.4681761Z Generating XML reports... 2022-09-27T16:03:50.4716022Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160346.xml 2022-09-27T16:03:52.4604229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:52.4604726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:52.4607863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:52.4608374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:52.6963010Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2kar0_ot 2022-09-27T16:03:52.6964393Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2kar0_ot/_remote_module_non_scriptable.py 2022-09-27T16:03:53.1354695Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:03:53.1371237Z 2022-09-27T16:03:53.1371509Z Running tests... 2022-09-27T16:03:53.1371965Z ---------------------------------------------------------------------- 2022-09-27T16:03:54.6297526Z test_device_maps_remote (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
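The paired UserWarnings above ("loaded 45 slow tests" / "loaded 261 disabled tests") repeat four times per test because each of the four spawned ranks re-imports torch.testing._internal.common_utils. A rough sketch of the loading step that emits them (the JSON file name here is hypothetical; only the warning text is taken from the log):

    import json
    import warnings

    def load_slow_tests(path="slow-tests.json"):  # hypothetical path
        # common_utils reads a JSON dict of known-slow tests and warns with
        # its size; the same pattern applies to the disabled-tests dict.
        with open(path) as f:
            slow_tests_dict = json.load(f)
        warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
        return slow_tests_dict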
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:03:54.6483062Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26031 2022-09-27T16:03:54.6489443Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26032 2022-09-27T16:03:54.6496650Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26033 2022-09-27T16:03:54.6503303Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26034 2022-09-27T16:03:56.2349779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:56.2350302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:56.2351711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:56.2352201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:56.2888272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:56.2888811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:56.2891736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:56.2892220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:56.3258471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:56.3259230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:56.3261450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:56.3262189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:56.3275042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:03:56.3275509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:03:56.3279251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:03:56.3279732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:03:56.4768380Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzr6yws0r 2022-09-27T16:03:56.4769254Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzr6yws0r/_remote_module_non_scriptable.py 2022-09-27T16:03:56.5079702Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp17fnp09z 2022-09-27T16:03:56.5081791Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp17fnp09z/_remote_module_non_scriptable.py 2022-09-27T16:03:56.5650149Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprco72q36 2022-09-27T16:03:56.5652140Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprco72q36/_remote_module_non_scriptable.py 2022-09-27T16:03:56.5652682Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv4jhj25o 2022-09-27T16:03:56.5655316Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv4jhj25o/_remote_module_non_scriptable.py 2022-09-27T16:03:56.9160425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:03:56.9370306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:03:57.0140427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:03:57.0199638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:03:57.0374442Z fi_getinfo: -61 2022-09-27T16:03:57.0580182Z fi_getinfo: -61 2022-09-27T16:03:57.1356182Z fi_getinfo: -61 2022-09-27T16:03:57.1414070Z fi_getinfo: -61 2022-09-27T16:04:02.9697763Z ok (9.832s) 2022-09-27T16:04:02.9698198Z 2022-09-27T16:04:02.9698679Z ---------------------------------------------------------------------- 2022-09-27T16:04:02.9699072Z Ran 1 test in 9.832s 2022-09-27T16:04:02.9699251Z 2022-09-27T16:04:02.9699347Z OK 2022-09-27T16:04:02.9699483Z 2022-09-27T16:04:02.9699624Z Generating XML reports... 2022-09-27T16:04:02.9735285Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160353.xml 2022-09-27T16:04:04.9592887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:04.9593756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:04.9595882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:04.9596392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:05.1890910Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3rl2sxph 2022-09-27T16:04:05.1892207Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3rl2sxph/_remote_module_non_scriptable.py 2022-09-27T16:04:05.6141942Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:05.6159466Z 2022-09-27T16:04:05.6159628Z Running tests... 2022-09-27T16:04:05.6160348Z ---------------------------------------------------------------------- 2022-09-27T16:04:07.0587010Z test_device_maps_return_to_gpu (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
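Each test body begins by launching four worker processes ("Started process 0 with pid ..."), one per rank, before the per-rank event listener threads come up. The interleaved "fi_getinfo: -61" lines appear to come from libfabric provider discovery during transport setup; -61 matches -FI_ENODATA (no matching provider found) and is typically benign on runners without EFA hardware. An illustrative sketch of the launch pattern (the real harness is torch.testing._internal.common_distributed; the worker function here is an example):

    import torch.multiprocessing as mp

    def worker(rank):
        # Each spawned process receives its rank as the first argument.
        print(f"running in rank {rank}")

    if __name__ == "__main__":
        mp.spawn(worker, nprocs=4)  # starts 4 processes, one per rank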
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:07.0765416Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26390 2022-09-27T16:04:07.0771878Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26391 2022-09-27T16:04:07.0778666Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26392 2022-09-27T16:04:07.0784486Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26393 2022-09-27T16:04:08.6611990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:08.6612515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:08.6613820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:08.6614281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:08.6650130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:08.6650608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:08.6653836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:08.6654299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:08.6683173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:08.6683636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:08.6687816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:08.6688278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:08.7058818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:08.7059286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:08.7062393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:08.7062875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:08.8931485Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd4yv3iui 2022-09-27T16:04:08.8932435Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd4yv3iui/_remote_module_non_scriptable.py 2022-09-27T16:04:08.9000663Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp9_be3e2 2022-09-27T16:04:08.9003599Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp9_be3e2/_remote_module_non_scriptable.py 2022-09-27T16:04:08.9063093Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp99igbb8l 2022-09-27T16:04:08.9065806Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp99igbb8l/_remote_module_non_scriptable.py 2022-09-27T16:04:08.9253765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptmbrst0i 2022-09-27T16:04:08.9256489Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptmbrst0i/_remote_module_non_scriptable.py 2022-09-27T16:04:09.3397595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T16:04:09.3491540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:04:09.3546730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:04:09.3701131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:09.7854084Z skip: Need at least 4 CUDA devices (4.169s) 2022-09-27T16:04:09.7854352Z 2022-09-27T16:04:09.7854725Z ---------------------------------------------------------------------- 2022-09-27T16:04:09.7855088Z Ran 1 test in 4.169s 2022-09-27T16:04:09.7855250Z 2022-09-27T16:04:09.7855644Z OK (skipped=1) 2022-09-27T16:04:09.7855825Z 2022-09-27T16:04:09.7855955Z Generating XML reports... 2022-09-27T16:04:09.7892881Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160405.xml 2022-09-27T16:04:11.7779288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:11.7779805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:11.7780688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:11.7781147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:12.0103753Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1f41o76q 2022-09-27T16:04:12.0104535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1f41o76q/_remote_module_non_scriptable.py 2022-09-27T16:04:12.4374955Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:12.4389756Z 2022-09-27T16:04:12.4389990Z Running tests... 2022-09-27T16:04:12.4390436Z ---------------------------------------------------------------------- 2022-09-27T16:04:13.8940981Z test_device_maps_return_to_gpu_self (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:13.9118588Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26561 2022-09-27T16:04:13.9124867Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26562 2022-09-27T16:04:13.9130950Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26563 2022-09-27T16:04:13.9142947Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26564 2022-09-27T16:04:15.5085572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:15.5086086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:15.5087316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:15.5087802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:15.5121930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:15.5122398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:15.5125797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:15.5126279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:15.5770801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:15.5771285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:15.5774152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:15.5774636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:15.6176628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:15.6177102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:15.6179348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:15.6179811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:15.7388280Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo6dy9gs4 2022-09-27T16:04:15.7389538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo6dy9gs4/_remote_module_non_scriptable.py 2022-09-27T16:04:15.7455562Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkmxguj42 2022-09-27T16:04:15.7458375Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkmxguj42/_remote_module_non_scriptable.py 2022-09-27T16:04:15.8029128Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfmz52rrm 2022-09-27T16:04:15.8031795Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfmz52rrm/_remote_module_non_scriptable.py 2022-09-27T16:04:15.8451947Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdhct0qml 2022-09-27T16:04:15.8454508Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdhct0qml/_remote_module_non_scriptable.py 2022-09-27T16:04:16.1837223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
2022-09-27T16:04:16.1867163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:04:16.2655925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:04:16.3207420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:16.7208161Z skip: Need at least 4 CUDA devices (4.281s) 2022-09-27T16:04:16.7208601Z 2022-09-27T16:04:16.7209017Z ---------------------------------------------------------------------- 2022-09-27T16:04:16.7209360Z Ran 1 test in 4.282s 2022-09-27T16:04:16.7209621Z 2022-09-27T16:04:16.7209816Z OK (skipped=1) 2022-09-27T16:04:16.7210033Z 2022-09-27T16:04:16.7210971Z Generating XML reports... 2022-09-27T16:04:16.7247248Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160412.xml 2022-09-27T16:04:18.6797201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:18.6797894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:18.6799038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:18.6799504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:18.9166305Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprj5fnr61 2022-09-27T16:04:18.9167639Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprj5fnr61/_remote_module_non_scriptable.py 2022-09-27T16:04:19.3570664Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:19.3586110Z 2022-09-27T16:04:19.3586358Z Running tests... 2022-09-27T16:04:19.3586788Z ---------------------------------------------------------------------- 2022-09-27T16:04:20.8452656Z test_device_maps_wrong_worker_name (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
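The two skipped cases above report "skip: Need at least 4 CUDA devices": this runner exposes fewer than four GPUs, so the harness skips rather than fails. A sketch of equivalent gating; the real decorator is skip_if_lt_x_gpu in torch/testing/_internal/common_distributed.py, and this mirrors its observable behavior rather than its exact implementation:

    import unittest
    import torch

    def skip_if_lt_x_gpu(x):
        # Skip unless at least x CUDA devices are visible to this process.
        return unittest.skipUnless(
            torch.cuda.is_available() and torch.cuda.device_count() >= x,
            f"Need at least {x} CUDA devices",
        )

Applied as @skip_if_lt_x_gpu(4) on a test method, this produces exactly the skip reason recorded in the log.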
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:20.8636567Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26732 2022-09-27T16:04:20.8642997Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26733 2022-09-27T16:04:20.8649762Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26734 2022-09-27T16:04:20.8656438Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26735 2022-09-27T16:04:22.4531949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:22.4532493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:22.4534226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:22.4535227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:22.4536095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:22.4536570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:22.4539508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:22.4540266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:22.4626095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:22.4626860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:22.4629955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:22.4631114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:22.4839420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:22.4840374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:22.4843322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:22.4844063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:22.6872299Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwi_ooerl 2022-09-27T16:04:22.6873517Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwi_ooerl/_remote_module_non_scriptable.py 2022-09-27T16:04:22.6941396Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpia4l30p_ 2022-09-27T16:04:22.6944316Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpia4l30p_/_remote_module_non_scriptable.py 2022-09-27T16:04:22.7040157Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp60obnzvr 2022-09-27T16:04:22.7042931Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp60obnzvr/_remote_module_non_scriptable.py 2022-09-27T16:04:22.7162794Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp9q6fp8b 2022-09-27T16:04:22.7165548Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp9q6fp8b/_remote_module_non_scriptable.py 2022-09-27T16:04:23.1335920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T16:04:23.1397261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:04:23.1538136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:04:23.1673614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:23.2552119Z fi_getinfo: -61 2022-09-27T16:04:23.2611911Z fi_getinfo: -61 2022-09-27T16:04:23.2752369Z fi_getinfo: -61 2022-09-27T16:04:23.2888022Z fi_getinfo: -61 2022-09-27T16:04:23.7728168Z ok (4.414s) 2022-09-27T16:04:23.7728535Z 2022-09-27T16:04:23.7729324Z ---------------------------------------------------------------------- 2022-09-27T16:04:23.7729832Z Ran 1 test in 4.414s 2022-09-27T16:04:23.7730003Z 2022-09-27T16:04:23.7730104Z OK 2022-09-27T16:04:23.7730244Z 2022-09-27T16:04:23.7730361Z Generating XML reports... 2022-09-27T16:04:23.7773447Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160419.xml 2022-09-27T16:04:25.7451260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:25.7451783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:25.7453428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:25.7453986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:25.9838978Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4nikaf0h 2022-09-27T16:04:25.9839813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4nikaf0h/_remote_module_non_scriptable.py 2022-09-27T16:04:26.4238489Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:26.4254052Z 2022-09-27T16:04:26.4254272Z Running tests... 2022-09-27T16:04:26.4254963Z ---------------------------------------------------------------------- 2022-09-27T16:04:27.8933805Z test_device_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
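test_device_mismatch deliberately runs a UDF that mixes devices; the tracebacks below show _gpu_add_wrong_gpus returning x.cpu() + y.cuda(), which raises the usual same-device RuntimeError on each worker. A standalone repro of that failure (requires one visible CUDA device):

    import torch

    x = torch.zeros(2, device="cuda:0")
    y = torch.ones(2, device="cuda:0")
    try:
        _ = x.cpu() + y.cuda()  # one operand on CPU, one on CUDA
    except RuntimeError as e:
        print(e)  # Expected all tensors to be on the same device, ...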
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:27.9113932Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26919 2022-09-27T16:04:27.9120494Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26920 2022-09-27T16:04:27.9126757Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 26921 2022-09-27T16:04:27.9133373Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 26922 2022-09-27T16:04:29.5037117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:29.5037605Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:29.5038735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:29.5039214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:29.5115773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:29.5116219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:29.5119767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:29.5120257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:29.5148660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:29.5149102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:29.5152821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:29.5153293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:29.5440945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:29.5441382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:29.5445032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:29.5445511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:29.7439635Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8tgnn29u 2022-09-27T16:04:29.7440561Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8tgnn29u/_remote_module_non_scriptable.py 2022-09-27T16:04:29.7476116Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwg8mxqcv 2022-09-27T16:04:29.7478875Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwg8mxqcv/_remote_module_non_scriptable.py 2022-09-27T16:04:29.7538314Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp26bg3ezs 2022-09-27T16:04:29.7541016Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp26bg3ezs/_remote_module_non_scriptable.py 2022-09-27T16:04:29.7776717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw_4_8ahh 2022-09-27T16:04:29.7779620Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw_4_8ahh/_remote_module_non_scriptable.py 2022-09-27T16:04:30.1879228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T16:04:30.1897260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:30.1926692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:04:30.2307487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:04:30.3096566Z fi_getinfo: -61 2022-09-27T16:04:30.3112511Z fi_getinfo: -61 2022-09-27T16:04:30.3139104Z fi_getinfo: -61 2022-09-27T16:04:30.3521934Z fi_getinfo: -61 2022-09-27T16:04:33.2457567Z On WorkerInfo(id=1, name=worker1): 2022-09-27T16:04:33.2488864Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fa3513b150b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fa3513acede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fa35ba3f8c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fa35ba40d8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fa35ba424b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fa35bd1ed0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a6997e (0x7fa35405c97e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a69a86 (0x7fa35405ca86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fa35c760b38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x33b04ca (0x7fa35df194ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x33b0c39 (0x7fa35df19c39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fa35c796f22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff7e7 (0x7fa368ba77e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ffb06 (0x7fa368ba7b06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x55d23fb65c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x55d23fb21499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x55d23fb215fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x55d23facd4b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x55d23fb6a098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x55d23fb17742 in /opt/conda/bin/python)\nframe #20: 
_PyObject_Call + 0x20a (0x55d23facffaa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x55d23fb6b774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x55d23fb17742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x55d23facffaa in /opt/conda/bin/python)\nframe #24: + 0xa53d8a (0x7fa3692fbd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fa3692f9fcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fa3692fd2a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fa3692feae6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fa35f323b7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fa3692fd095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x47b3f43 (0x7fa35f31cf43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fa35f31dad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fa35f317fd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x47e3a02 (0x7fa35f34ca02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fa35139f93b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xc9039 (0x7fa3809c3039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7fa3a0f736db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7fa3a0c9c61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-09-27T16:04:33.2506855Z Traceback (most recent call last): 2022-09-27T16:04:33.2508068Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-09-27T16:04:33.2509135Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-09-27T16:04:33.2510572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-09-27T16:04:33.2511885Z return x.cpu() + y.cuda() 2022-09-27T16:04:33.2512780Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-09-27T16:04:33.2514054Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-09-27T16:04:33.2516079Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fa3513b150b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2518608Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fa3513acede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2520777Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fa35ba3f8c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2522675Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fa35ba40d8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2524814Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fa35ba424b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2526960Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fa35bd1ed0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2528834Z frame #6: + 0x2a6997e (0x7fa35405c97e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.2530343Z frame #7: + 0x2a69a86 (0x7fa35405ca86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.2532274Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fa35c760b38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2534008Z frame #9: + 0x33b04ca (0x7fa35df194ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2535525Z frame #10: + 0x33b0c39 (0x7fa35df19c39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2537275Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fa35c796f22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2538948Z frame #12: + 0x2ff7e7 (0x7fa368ba77e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2540484Z frame #13: + 0x2ffb06 (0x7fa368ba7b06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2541562Z frame #14: + 0x1ddc68 (0x55d23fb65c68 in /opt/conda/bin/python) 2022-09-27T16:04:33.2542445Z frame #15: + 0x199499 (0x55d23fb21499 in /opt/conda/bin/python) 2022-09-27T16:04:33.2543355Z frame #16: + 0x1995fa (0x55d23fb215fa in /opt/conda/bin/python) 2022-09-27T16:04:33.2544196Z frame #17: PyNumber_Add + 0x41 (0x55d23facd4b1 in /opt/conda/bin/python) 2022-09-27T16:04:33.2544925Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x55d23fb6a098 in /opt/conda/bin/python) 2022-09-27T16:04:33.2545466Z frame #19: + 0x18f742 (0x55d23fb17742 in /opt/conda/bin/python) 2022-09-27T16:04:33.2546215Z frame #20: _PyObject_Call + 0x20a (0x55d23facffaa in /opt/conda/bin/python) 2022-09-27T16:04:33.2546989Z frame #21: _PyEval_EvalFrameDefault + 
0x26e4 (0x55d23fb6b774 in /opt/conda/bin/python) 2022-09-27T16:04:33.2547752Z frame #22: + 0x18f742 (0x55d23fb17742 in /opt/conda/bin/python) 2022-09-27T16:04:33.2548459Z frame #23: _PyObject_Call + 0x20a (0x55d23facffaa in /opt/conda/bin/python) 2022-09-27T16:04:33.2549611Z frame #24: + 0xa53d8a (0x7fa3692fbd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2551716Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fa3692f9fcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2554360Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fa3692fd2a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2556598Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fa3692feae6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2558891Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fa35f323b7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2561277Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fa3692fd095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2563375Z frame #30: + 0x47b3f43 (0x7fa35f31cf43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2565613Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fa35f31dad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2568251Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fa35f317fd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2570128Z frame #33: + 0x47e3a02 (0x7fa35f34ca02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2571799Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fa35139f93b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2572998Z frame #35: + 0xc9039 (0x7fa3809c3039 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-09-27T16:04:33.2574303Z frame #36: + 0x76db (0x7fa3a0f736db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-09-27T16:04:33.2575457Z frame #37: clone + 0x3f (0x7fa3a0c9c61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-09-27T16:04:33.2575992Z 2022-09-27T16:04:33.2576026Z 2022-09-27T16:04:33.2664885Z On WorkerInfo(id=0, name=worker0): 2022-09-27T16:04:33.2695720Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f95e776150b in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f95e775cede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7f95f1def8c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f95f1df0d8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f95f1df24b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f95f20ced0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a6997e (0x7f95ea40c97e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a69a86 (0x7f95ea40ca86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f95f2b10b38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x33b04ca (0x7f95f42c94ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x33b0c39 (0x7f95f42c9c39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f95f2b46f22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff7e7 (0x7f95fef577e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ffb06 (0x7f95fef57b06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x5561be035c68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x5561bdff1499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x5561bdff15fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x5561bdf9d4b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x5561be03a098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x5561bdfe7742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x5561bdf9ffaa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x5561be03b774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x5561bdfe7742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x5561bdf9ffaa in /opt/conda/bin/python)\nframe #24: + 0xa53d8a (0x7f95ff6abd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f95ff6a9fcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f95ff6ad2a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f95ff6aeae6 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f95f56d3b7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f95ff6ad095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x47b3f43 (0x7f95f56ccf43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f95f56cdad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f95f56c7fd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x47e3a02 (0x7f95f56fca02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f95e774f93b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xc9039 (0x7f9616d73039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7f96373236db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7f963704c61f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-09-27T16:04:33.2713953Z Traceback (most recent call last): 2022-09-27T16:04:33.2715218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-09-27T16:04:33.2716420Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-09-27T16:04:33.2717874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-09-27T16:04:33.2718837Z return x.cpu() + y.cuda() 2022-09-27T16:04:33.2719803Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-09-27T16:04:33.2721056Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-09-27T16:04:33.2723048Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7f95e776150b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2725339Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7f95e775cede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2727461Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7f95f1def8c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2729359Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f95f1df0d8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2731444Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7f95f1df24b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2733571Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7f95f20ced0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2735300Z frame #6: + 0x2a6997e (0x7f95ea40c97e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.2736847Z frame #7: + 0x2a69a86 (0x7f95ea40ca86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.2738746Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7f95f2b10b38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2740483Z frame #9: + 0x33b04ca (0x7f95f42c94ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2741988Z frame #10: + 0x33b0c39 (0x7f95f42c9c39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2743755Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7f95f2b46f22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2745420Z frame #12: + 0x2ff7e7 (0x7f95fef577e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2747040Z frame #13: + 0x2ffb06 (0x7f95fef57b06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2748122Z frame #14: + 0x1ddc68 (0x5561be035c68 in /opt/conda/bin/python) 2022-09-27T16:04:33.2749052Z frame #15: + 0x199499 (0x5561bdff1499 in /opt/conda/bin/python) 2022-09-27T16:04:33.2749933Z frame #16: + 0x1995fa (0x5561bdff15fa in /opt/conda/bin/python) 2022-09-27T16:04:33.2751198Z frame #17: PyNumber_Add + 0x41 (0x5561bdf9d4b1 in /opt/conda/bin/python) 2022-09-27T16:04:33.2752193Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x5561be03a098 in /opt/conda/bin/python) 2022-09-27T16:04:33.2753137Z frame #19: + 0x18f742 (0x5561bdfe7742 in /opt/conda/bin/python) 2022-09-27T16:04:33.2754157Z frame #20: _PyObject_Call + 0x20a (0x5561bdf9ffaa in /opt/conda/bin/python) 2022-09-27T16:04:33.2755116Z frame #21: _PyEval_EvalFrameDefault + 
0x26e4 (0x5561be03b774 in /opt/conda/bin/python) 2022-09-27T16:04:33.2756064Z frame #22: + 0x18f742 (0x5561bdfe7742 in /opt/conda/bin/python) 2022-09-27T16:04:33.2756930Z frame #23: _PyObject_Call + 0x20a (0x5561bdf9ffaa in /opt/conda/bin/python) 2022-09-27T16:04:33.2758325Z frame #24: + 0xa53d8a (0x7f95ff6abd8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2760201Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f95ff6a9fcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2762589Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7f95ff6ad2a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2765257Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7f95ff6aeae6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2768176Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7f95f56d3b7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2771206Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7f95ff6ad095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2773322Z frame #30: + 0x47b3f43 (0x7f95f56ccf43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2775540Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7f95f56cdad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2778069Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7f95f56c7fd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2779921Z frame #33: + 0x47e3a02 (0x7f95f56fca02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2781473Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7f95e774f93b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2782756Z frame #35: + 0xc9039 (0x7f9616d73039 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-09-27T16:04:33.2784021Z frame #36: + 0x76db (0x7f96373236db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-09-27T16:04:33.2785157Z frame #37: clone + 0x3f (0x7f963704c61f in /lib/x86_64-linux-gnu/libc.so.6) 2022-09-27T16:04:33.2785654Z 2022-09-27T16:04:33.2785694Z 2022-09-27T16:04:33.2820277Z On WorkerInfo(id=3, name=worker3): 2022-09-27T16:04:33.2850527Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd89884750b in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fd898842ede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fd8a2ed58c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fd8a2ed6d8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fd8a2ed84b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fd8a31b4d0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a6997e (0x7fd89b4f297e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a69a86 (0x7fd89b4f2a86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fd8a3bf6b38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x33b04ca (0x7fd8a53af4ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x33b0c39 (0x7fd8a53afc39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fd8a3c2cf22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff7e7 (0x7fd8b003d7e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ffb06 (0x7fd8b003db06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x561909dbcc68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x561909d78499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x561909d785fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x561909d244b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x561909dc1098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x561909d6e742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x561909d26faa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x561909dc2774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x561909d6e742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x561909d26faa in /opt/conda/bin/python)\nframe #24: + 0xa53d8a (0x7fd8b0791d8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd8b078ffcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd8b07932a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fd8b0794ae6 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fd8a67b9b7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd8b0793095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x47b3f43 (0x7fd8a67b2f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd8a67b3ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd8a67adfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x47e3a02 (0x7fd8a67e2a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd89883593b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xc9039 (0x7fd8c7e59039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7fd8e84096db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7fd8e813261f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-09-27T16:04:33.2868535Z Traceback (most recent call last): 2022-09-27T16:04:33.2869781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-09-27T16:04:33.2871136Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-09-27T16:04:33.2872582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-09-27T16:04:33.2873540Z return x.cpu() + y.cuda() 2022-09-27T16:04:33.2874464Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-09-27T16:04:33.2875738Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-09-27T16:04:33.2877707Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fd89884750b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2880029Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fd898842ede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2882175Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fd8a2ed58c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2884238Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fd8a2ed6d8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2886413Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fd8a2ed84b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2888519Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fd8a31b4d0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2890241Z frame #6: + 0x2a6997e (0x7fd89b4f297e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.2891927Z frame #7: + 0x2a69a86 (0x7fd89b4f2a86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.2893884Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fd8a3bf6b38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2895616Z frame #9: + 0x33b04ca (0x7fd8a53af4ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2897137Z frame #10: + 0x33b0c39 (0x7fd8a53afc39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2898944Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fd8a3c2cf22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2900624Z frame #12: + 0x2ff7e7 (0x7fd8b003d7e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2902146Z frame #13: + 0x2ffb06 (0x7fd8b003db06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2903206Z frame #14: + 0x1ddc68 (0x561909dbcc68 in /opt/conda/bin/python) 2022-09-27T16:04:33.2904131Z frame #15: + 0x199499 (0x561909d78499 in /opt/conda/bin/python) 2022-09-27T16:04:33.2905037Z frame #16: + 0x1995fa (0x561909d785fa in /opt/conda/bin/python) 2022-09-27T16:04:33.2905897Z frame #17: PyNumber_Add + 0x41 (0x561909d244b1 in /opt/conda/bin/python) 2022-09-27T16:04:33.2906815Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x561909dc1098 in /opt/conda/bin/python) 2022-09-27T16:04:33.2907756Z frame #19: + 0x18f742 (0x561909d6e742 in /opt/conda/bin/python) 2022-09-27T16:04:33.2908641Z frame #20: _PyObject_Call + 0x20a (0x561909d26faa in /opt/conda/bin/python) 2022-09-27T16:04:33.2909548Z frame #21: _PyEval_EvalFrameDefault + 
0x26e4 (0x561909dc2774 in /opt/conda/bin/python) 2022-09-27T16:04:33.2910497Z frame #22: + 0x18f742 (0x561909d6e742 in /opt/conda/bin/python) 2022-09-27T16:04:33.2911676Z frame #23: _PyObject_Call + 0x20a (0x561909d26faa in /opt/conda/bin/python) 2022-09-27T16:04:33.2913058Z frame #24: + 0xa53d8a (0x7fd8b0791d8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2914923Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fd8b078ffcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2917320Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fd8b07932a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2920101Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fd8b0794ae6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2923035Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fd8a67b9b7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2926084Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fd8b0793095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.2928167Z frame #30: + 0x47b3f43 (0x7fd8a67b2f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2930325Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fd8a67b3ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2932943Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fd8a67adfd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2934768Z frame #33: + 0x47e3a02 (0x7fd8a67e2a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.2936229Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fd89883593b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.2937301Z frame #35: + 0xc9039 (0x7fd8c7e59039 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-09-27T16:04:33.2938500Z frame #36: + 0x76db (0x7fd8e84096db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-09-27T16:04:33.2939626Z frame #37: clone + 0x3f (0x7fd8e813261f in /lib/x86_64-linux-gnu/libc.so.6) 2022-09-27T16:04:33.2940137Z 2022-09-27T16:04:33.2940168Z 2022-09-27T16:04:33.2940523Z On WorkerInfo(id=2, name=worker2): 2022-09-27T16:04:33.2970977Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fbfa4e8d50b in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fbfa4e88ede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fbfaf51b8c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fbfaf51cd8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fbfaf51e4b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fbfaf7fad0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: + 0x2a6997e (0x7fbfa7b3897e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: + 0x2a69a86 (0x7fbfa7b38a86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fbfb023cb38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: + 0x33b04ca (0x7fbfb19f54ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: + 0x33b0c39 (0x7fbfb19f5c39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fbfb0272f22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: + 0x2ff7e7 (0x7fbfbc6837e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #13: + 0x2ffb06 (0x7fbfbc683b06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #14: + 0x1ddc68 (0x564daa3acc68 in /opt/conda/bin/python)\nframe #15: + 0x199499 (0x564daa368499 in /opt/conda/bin/python)\nframe #16: + 0x1995fa (0x564daa3685fa in /opt/conda/bin/python)\nframe #17: PyNumber_Add + 0x41 (0x564daa3144b1 in /opt/conda/bin/python)\nframe #18: _PyEval_EvalFrameDefault + 0x1008 (0x564daa3b1098 in /opt/conda/bin/python)\nframe #19: + 0x18f742 (0x564daa35e742 in /opt/conda/bin/python)\nframe #20: _PyObject_Call + 0x20a (0x564daa316faa in /opt/conda/bin/python)\nframe #21: _PyEval_EvalFrameDefault + 0x26e4 (0x564daa3b2774 in /opt/conda/bin/python)\nframe #22: + 0x18f742 (0x564daa35e742 in /opt/conda/bin/python)\nframe #23: _PyObject_Call + 0x20a (0x564daa316faa in /opt/conda/bin/python)\nframe #24: + 0xa53d8a (0x7fbfbcdd7d8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fbfbcdd5fcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fbfbcdd92a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fbfbcddaae6 in 
/opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fbfb2dffb7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fbfbcdd9095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so)\nframe #30: + 0x47b3f43 (0x7fbfb2df8f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fbfb2df9ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fbfb2df3fd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #33: + 0x47e3a02 (0x7fbfb2e28a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)\nframe #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fbfa4e7b93b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)\nframe #35: + 0xc9039 (0x7fbfd449f039 in /opt/conda/bin/../lib/libstdc++.so.6)\nframe #36: + 0x76db (0x7fbff4a4f6db in /lib/x86_64-linux-gnu/libpthread.so.0)\nframe #37: clone + 0x3f (0x7fbff477861f in /lib/x86_64-linux-gnu/libc.so.6)\n') 2022-09-27T16:04:33.2989048Z Traceback (most recent call last): 2022-09-27T16:04:33.2990296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/rpc/internal.py", line 206, in _run_function 2022-09-27T16:04:33.2991632Z result = python_udf.func(*python_udf.args, **python_udf.kwargs) 2022-09-27T16:04:33.2993072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5911, in _gpu_add_wrong_gpus 2022-09-27T16:04:33.2993979Z return x.cpu() + y.cuda() 2022-09-27T16:04:33.2995094Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! 
2022-09-27T16:04:33.2996374Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:484 (most recent call first): 2022-09-27T16:04:33.2998370Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator >) + 0x6b (0x7fbfa4e8d50b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.3000704Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&) + 0xce (0x7fbfa4e88ede in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.3002863Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0xc83 (0x7fbfaf51b8c3 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3004786Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7fbfaf51cd8f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3006922Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf2 (0x7fbfaf51e4b2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3009042Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2e (0x7fbfaf7fad0e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3010779Z frame #6: + 0x2a6997e (0x7fbfa7b3897e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.3012311Z frame #7: + 0x2a69a86 (0x7fbfa7b38a86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so) 2022-09-27T16:04:33.3014274Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x98 (0x7fbfb023cb38 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3016041Z frame #9: + 0x33b04ca (0x7fbfb19f54ca in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3017533Z frame #10: + 0x33b0c39 (0x7fbfb19f5c39 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3019323Z frame #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x172 (0x7fbfb0272f22 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3021002Z frame #12: + 0x2ff7e7 (0x7fbfbc6837e7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3022547Z frame #13: + 0x2ffb06 (0x7fbfbc683b06 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3023622Z frame #14: + 0x1ddc68 (0x564daa3acc68 in /opt/conda/bin/python) 2022-09-27T16:04:33.3024678Z frame #15: + 0x199499 (0x564daa368499 in /opt/conda/bin/python) 2022-09-27T16:04:33.3025610Z frame #16: + 0x1995fa (0x564daa3685fa in /opt/conda/bin/python) 2022-09-27T16:04:33.3026521Z frame #17: PyNumber_Add + 0x41 (0x564daa3144b1 in /opt/conda/bin/python) 2022-09-27T16:04:33.3027430Z frame #18: _PyEval_EvalFrameDefault + 0x1008 (0x564daa3b1098 in /opt/conda/bin/python) 2022-09-27T16:04:33.3028384Z frame #19: + 0x18f742 (0x564daa35e742 in /opt/conda/bin/python) 2022-09-27T16:04:33.3029280Z frame #20: _PyObject_Call + 0x20a (0x564daa316faa in /opt/conda/bin/python) 2022-09-27T16:04:33.3030219Z frame #21: _PyEval_EvalFrameDefault + 
0x26e4 (0x564daa3b2774 in /opt/conda/bin/python) 2022-09-27T16:04:33.3031415Z frame #22: + 0x18f742 (0x564daa35e742 in /opt/conda/bin/python) 2022-09-27T16:04:33.3032447Z frame #23: _PyObject_Call + 0x20a (0x564daa316faa in /opt/conda/bin/python) 2022-09-27T16:04:33.3033883Z frame #24: + 0xa53d8a (0x7fbfbcdd7d8a in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3035753Z frame #25: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7fbfbcdd5fcd in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3038186Z frame #26: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector >, bool) const + 0x85 (0x7fbfbcdd92a5 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3040883Z frame #27: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector >) const + 0x96 (0x7fbfbcddaae6 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3043826Z frame #28: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x10c (0x7fbfb2dffb7c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3046942Z frame #29: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector >) const + 0x65 (0x7fbfbcdd9095 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_python.so) 2022-09-27T16:04:33.3049070Z frame #30: + 0x47b3f43 (0x7fbfb2df8f43 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3051274Z frame #31: torch::distributed::rpc::RequestCallbackNoPython::processMessage(torch::distributed::rpc::Message&, std::vector >) const + 0x538 (0x7fbfb2df9ad8 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3053925Z frame #32: torch::distributed::rpc::RequestCallback::operator()(torch::distributed::rpc::Message&, std::vector >) const + 0x57 (0x7fbfb2df3fd7 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3055804Z frame #33: + 0x47e3a02 (0x7fbfb2e28a02 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so) 2022-09-27T16:04:33.3057411Z frame #34: c10::ThreadPool::main_loop(unsigned long) + 0x2db (0x7fbfa4e7b93b in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so) 2022-09-27T16:04:33.3058561Z frame #35: + 0xc9039 (0x7fbfd449f039 in /opt/conda/bin/../lib/libstdc++.so.6) 2022-09-27T16:04:33.3059834Z frame #36: + 0x76db (0x7fbff4a4f6db in /lib/x86_64-linux-gnu/libpthread.so.0) 2022-09-27T16:04:33.3061002Z frame #37: clone + 0x3f (0x7fbff477861f in /lib/x86_64-linux-gnu/libc.so.6) 2022-09-27T16:04:33.3061508Z 2022-09-27T16:04:33.3061544Z 2022-09-27T16:04:33.9268190Z ok (7.501s) 2022-09-27T16:04:33.9268421Z 2022-09-27T16:04:33.9268813Z ---------------------------------------------------------------------- 2022-09-27T16:04:33.9269151Z Ran 1 test in 7.501s 2022-09-27T16:04:33.9269298Z 2022-09-27T16:04:33.9269397Z OK 2022-09-27T16:04:33.9269532Z 2022-09-27T16:04:33.9269665Z Generating XML reports... 
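Note on the failures above: the repeated "On WorkerInfo(...)" dumps are the same intentional error observed from each peer. The remote UDF _gpu_add_wrong_gpus (rpc_test.py:5911) evaluates x.cpu() + y.cuda(), so at::TensorIteratorBase::compute_types rejects the mixed-device add; the trailing "ok (7.501s)" indicates the test expects exactly this RuntimeError to propagate back through RPC. A minimal local repro of the error, assuming a CUDA-capable machine (a standalone sketch, not code from the test suite):

    import torch

    x = torch.ones(2)
    y = torch.ones(2)
    try:
        x.cpu() + y.cuda()  # same expression as _gpu_add_wrong_gpus above
    except RuntimeError as e:
        # "Expected all tensors to be on the same device, but found at
        # least two devices, cuda:0 and cpu!"
        print(e)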
2022-09-27T16:04:33.9306595Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160426.xml 2022-09-27T16:04:35.9202644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:35.9203141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:35.9205963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:35.9206738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:36.1552456Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmijico69 2022-09-27T16:04:36.1554608Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmijico69/_remote_module_non_scriptable.py 2022-09-27T16:04:36.5951934Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:36.5968296Z 2022-09-27T16:04:36.5968557Z Running tests... 2022-09-27T16:04:36.5968979Z ---------------------------------------------------------------------- 2022-09-27T16:04:38.0860207Z test_devices_option_mismatch (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:38.1045788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27266 2022-09-27T16:04:38.1052599Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27267 2022-09-27T16:04:38.1059261Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27268 2022-09-27T16:04:38.1066191Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27269 2022-09-27T16:04:39.7966912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:39.7967490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:39.7968616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:39.7969355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:39.8121693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:39.8122182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:39.8126269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:39.8126862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:39.8174975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:39.8175705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:39.8179139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:39.8179860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:39.8451597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:39.8452055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:39.8455507Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:39.8456193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:40.0585973Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprmzixklo 2022-09-27T16:04:40.0586568Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprmzixklo/_remote_module_non_scriptable.py 2022-09-27T16:04:40.0587092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpju04w7ta 2022-09-27T16:04:40.0589499Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpju04w7ta/_remote_module_non_scriptable.py 2022-09-27T16:04:40.0656936Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp98q_1m7 2022-09-27T16:04:40.0659699Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp98q_1m7/_remote_module_non_scriptable.py 2022-09-27T16:04:40.0671405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp702mbq72 2022-09-27T16:04:40.0674305Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp702mbq72/_remote_module_non_scriptable.py 2022-09-27T16:04:40.5082171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:04:40.5096849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:40.5098032Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:04:40.5194134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:04:40.6303235Z fi_getinfo: -61 2022-09-27T16:04:40.6318780Z fi_getinfo: -61 2022-09-27T16:04:40.6323767Z fi_getinfo: -61 2022-09-27T16:04:40.6410418Z fi_getinfo: -61 2022-09-27T16:04:41.1139797Z ok (4.517s) 2022-09-27T16:04:41.1140088Z 2022-09-27T16:04:41.1140486Z ---------------------------------------------------------------------- 2022-09-27T16:04:41.1140826Z Ran 1 test in 4.517s 2022-09-27T16:04:41.1140990Z 2022-09-27T16:04:41.1141083Z OK 2022-09-27T16:04:41.1141219Z 2022-09-27T16:04:41.1141349Z Generating XML reports... 2022-09-27T16:04:41.1178235Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160436.xml 2022-09-27T16:04:43.0827219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:43.0827729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:43.0830294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:43.0831032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:43.3121193Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvwgyh_c_ 2022-09-27T16:04:43.3122964Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvwgyh_c_/_remote_module_non_scriptable.py 2022-09-27T16:04:43.7485987Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:43.7501389Z 2022-09-27T16:04:43.7501809Z Running tests... 
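Note on the devices-option tests: test_devices_option_mismatch (passed above in 4.517s) and test_devices_option_mismatch_reverse (starting below) exercise the CUDA device configuration of the TensorPipe RPC backend. A hedged sketch of how those options are set through the public API follows; the exact mismatch each test asserts on is not visible in this log:

    import torch.distributed.rpc as rpc

    opts = rpc.TensorPipeRpcBackendOptions(
        num_worker_threads=8,
        # Tensors on local cuda:0 are delivered to cuda:1 on "worker1".
        device_maps={"worker1": {0: 1}},
        # Devices this agent may use; an inconsistency between `devices`
        # and `device_maps` is the kind of misconfiguration these tests
        # appear to target.
        devices=["cuda:0"],
    )
    # rpc.init_rpc("worker0", rank=0, world_size=4, rpc_backend_options=opts)

The recurring "fi_getinfo: -61" lines come from transport discovery; -61 is most likely -FI_ENODATA from libfabric (no matching fabric provider on this runner) and is evidently harmless here, since each test proceeds and passes.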
2022-09-27T16:04:43.7502274Z ---------------------------------------------------------------------- 2022-09-27T16:04:45.2209909Z test_devices_option_mismatch_reverse (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:45.2388963Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27453 2022-09-27T16:04:45.2396088Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27454 2022-09-27T16:04:45.2403575Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27455 2022-09-27T16:04:45.2410811Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27456 2022-09-27T16:04:46.8489907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:46.8491115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:46.8491773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:46.8492245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:46.8868260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:46.8868718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:46.8871780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:46.8872263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:46.9013695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:46.9014143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:46.9018631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:46.9019113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:46.9095162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:46.9095609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:46.9099740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:46.9100213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:47.0846178Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg87nlknt 2022-09-27T16:04:47.0846810Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg87nlknt/_remote_module_non_scriptable.py 2022-09-27T16:04:47.1189008Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg83rjgdt 2022-09-27T16:04:47.1191915Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg83rjgdt/_remote_module_non_scriptable.py 2022-09-27T16:04:47.1374057Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3u1imy5l 2022-09-27T16:04:47.1376855Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3u1imy5l/_remote_module_non_scriptable.py 2022-09-27T16:04:47.1398051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_8qcxe0s 2022-09-27T16:04:47.1401921Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_8qcxe0s/_remote_module_non_scriptable.py 2022-09-27T16:04:47.5143469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:47.5647058Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:04:47.5916617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:04:47.5925548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:04:47.6362627Z fi_getinfo: -61 2022-09-27T16:04:47.6864289Z fi_getinfo: -61 2022-09-27T16:04:47.7135764Z fi_getinfo: -61 2022-09-27T16:04:47.7143806Z fi_getinfo: -61 2022-09-27T16:04:48.2483317Z ok (4.498s) 2022-09-27T16:04:48.2483676Z 2022-09-27T16:04:48.2484491Z ---------------------------------------------------------------------- 2022-09-27T16:04:48.2485061Z Ran 1 test in 4.498s 2022-09-27T16:04:48.2485231Z 2022-09-27T16:04:48.2485330Z OK 2022-09-27T16:04:48.2485470Z 2022-09-27T16:04:48.2485611Z Generating XML reports... 2022-09-27T16:04:48.2520156Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160443.xml 2022-09-27T16:04:50.2464028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:50.2464561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:50.2465366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:50.2465850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:50.4896561Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcrifrqk4 2022-09-27T16:04:50.4897393Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcrifrqk4/_remote_module_non_scriptable.py 2022-09-27T16:04:50.9333273Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:04:50.9348702Z 2022-09-27T16:04:50.9349082Z Running tests... 2022-09-27T16:04:50.9349523Z ---------------------------------------------------------------------- 2022-09-27T16:04:52.4536498Z test_owner_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:04:52.4721462Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27640 2022-09-27T16:04:52.4728433Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27641 2022-09-27T16:04:52.4734598Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27642 2022-09-27T16:04:52.4741465Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27643 2022-09-27T16:04:54.0678762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:54.0679701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:54.0680904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:54.0681865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:54.0818059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:54.0819005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:54.0821615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:54.0822574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:54.1048695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:54.1049587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:54.1053574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:54.1054570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:54.1550227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:04:54.1551702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:04:54.1552941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:04:54.1553903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:04:54.3168850Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps5p3_im0 2022-09-27T16:04:54.3170252Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps5p3_im0/_remote_module_non_scriptable.py 2022-09-27T16:04:54.3181496Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiouttfer 2022-09-27T16:04:54.3184698Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiouttfer/_remote_module_non_scriptable.py 2022-09-27T16:04:54.3327118Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1bshgnrf 2022-09-27T16:04:54.3330046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1bshgnrf/_remote_module_non_scriptable.py 2022-09-27T16:04:54.3896990Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph9qjj4iv 2022-09-27T16:04:54.3899769Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph9qjj4iv/_remote_module_non_scriptable.py 2022-09-27T16:04:54.7616430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T16:04:54.7661073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:04:54.7742663Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:04:54.8502454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:04:54.8962494Z fi_getinfo: -61 2022-09-27T16:05:01.1911096Z ok (10.256s) 2022-09-27T16:05:01.1911506Z 2022-09-27T16:05:01.1911947Z ---------------------------------------------------------------------- 2022-09-27T16:05:01.1912307Z Ran 1 test in 10.256s 2022-09-27T16:05:01.1912473Z 2022-09-27T16:05:01.1912570Z OK 2022-09-27T16:05:01.1912705Z 2022-09-27T16:05:01.1912819Z Generating XML reports... 2022-09-27T16:05:01.1948832Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160450.xml 2022-09-27T16:05:03.1853742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:03.1854265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:03.1855639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:03.1856146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:03.4291278Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprvxn0df7 2022-09-27T16:05:03.4292445Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprvxn0df7/_remote_module_non_scriptable.py 2022-09-27T16:05:03.8758666Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:05:03.8774233Z 2022-09-27T16:05:03.8774358Z Running tests... 2022-09-27T16:05:03.8775361Z ---------------------------------------------------------------------- 2022-09-27T16:05:05.3710097Z test_owner_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:05:05.3894965Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27855 2022-09-27T16:05:05.3901758Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27856 2022-09-27T16:05:05.3908273Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 27857 2022-09-27T16:05:05.3915706Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 27858 2022-09-27T16:05:06.9877914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:06.9878409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:06.9880499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:06.9880993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:06.9941991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:06.9942430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:06.9945830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:06.9946322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:07.0111716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:07.0112181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:07.0115820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:07.0116299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:07.0116881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:07.0117307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:07.0121243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:07.0121886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:07.2331798Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbbt4q8id 2022-09-27T16:05:07.2333092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbbt4q8id/_remote_module_non_scriptable.py 2022-09-27T16:05:07.2422160Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphq4jcgq3 2022-09-27T16:05:07.2424948Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphq4jcgq3/_remote_module_non_scriptable.py 2022-09-27T16:05:07.2482533Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprnf3xchu 2022-09-27T16:05:07.2485370Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprnf3xchu/_remote_module_non_scriptable.py 2022-09-27T16:05:07.2581389Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqb3j53b0 2022-09-27T16:05:07.2584491Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqb3j53b0/_remote_module_non_scriptable.py 2022-09-27T16:05:07.6877472Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:05:07.6942752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:05:07.6960512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:05:07.7073134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:05:07.8402543Z fi_getinfo: -61 2022-09-27T16:05:15.7111349Z ok (11.833s) 2022-09-27T16:05:15.7111769Z 2022-09-27T16:05:15.7112416Z ---------------------------------------------------------------------- 2022-09-27T16:05:15.7113025Z Ran 1 test in 11.833s 2022-09-27T16:05:15.7113292Z 2022-09-27T16:05:15.7113452Z OK 2022-09-27T16:05:15.7113693Z 2022-09-27T16:05:15.7113946Z Generating XML reports... 2022-09-27T16:05:15.7149957Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160503.xml 2022-09-27T16:05:17.7117303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:17.7117801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:17.7119566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:17.7120049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:17.9492335Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk_ekz0jh 2022-09-27T16:05:17.9494119Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk_ekz0jh/_remote_module_non_scriptable.py 2022-09-27T16:05:18.3883183Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:05:18.3899175Z 2022-09-27T16:05:18.3899568Z Running tests... 2022-09-27T16:05:18.3900042Z ---------------------------------------------------------------------- 2022-09-27T16:05:19.8953583Z test_owner_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:05:19.9130175Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28071 2022-09-27T16:05:19.9135910Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28072 2022-09-27T16:05:19.9142292Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28073 2022-09-27T16:05:19.9148837Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28074 2022-09-27T16:05:21.4992887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:21.4993669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:21.4995531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:21.4996024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:21.5050225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:21.5050689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:21.5054196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:21.5054684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:21.5293320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:21.5293775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:21.5297304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:21.5297768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:21.5877249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:05:21.5877940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:05:21.5879419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:05:21.5879874Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:05:21.7382740Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0ag473cm 2022-09-27T16:05:21.7383494Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0ag473cm/_remote_module_non_scriptable.py 2022-09-27T16:05:21.7433191Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp73vkyagz 2022-09-27T16:05:21.7436191Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp73vkyagz/_remote_module_non_scriptable.py 2022-09-27T16:05:21.7515609Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5am0114b 2022-09-27T16:05:21.7518345Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5am0114b/_remote_module_non_scriptable.py 2022-09-27T16:05:21.8145333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppon502vl 2022-09-27T16:05:21.8147798Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppon502vl/_remote_module_non_scriptable.py 2022-09-27T16:05:22.1887792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:05:22.1902762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:05:22.1912289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:05:22.2756103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:05:22.3121985Z fi_getinfo: -61
2022-09-27T16:05:30.4347582Z ok (12.044s)
2022-09-27T16:05:30.4347863Z
2022-09-27T16:05:30.4348250Z ----------------------------------------------------------------------
2022-09-27T16:05:30.4348581Z Ran 1 test in 12.045s
2022-09-27T16:05:30.4348800Z
2022-09-27T16:05:30.4348898Z OK
2022-09-27T16:05:30.4349033Z
2022-09-27T16:05:30.4349168Z Generating XML reports...
2022-09-27T16:05:30.4384057Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160518.xml
2022-09-27T16:05:32.4184386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:32.4187139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:32.4188102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:32.4188598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:32.6457024Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl1ml184v
2022-09-27T16:05:32.6458584Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl1ml184v/_remote_module_non_scriptable.py
2022-09-27T16:05:33.0723550Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:05:33.0738401Z
2022-09-27T16:05:33.0738667Z Running tests...
2022-09-27T16:05:33.0739106Z ----------------------------------------------------------------------
2022-09-27T16:05:34.5326926Z test_owner_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:05:34.5503817Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28287
2022-09-27T16:05:34.5509885Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28288
2022-09-27T16:05:34.5516885Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28289
2022-09-27T16:05:34.5523620Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28290
2022-09-27T16:05:36.1378700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:36.1379214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:36.1381002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:36.1381486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:36.1416300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:36.1416760Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:36.1420344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:36.1420828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:36.1889550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:36.1889998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:36.1893233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:36.1893707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:36.2307000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:36.2307579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:36.2310205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:36.2311134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:36.3634401Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoftcf3x5
2022-09-27T16:05:36.3635897Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoftcf3x5/_remote_module_non_scriptable.py
2022-09-27T16:05:36.3768997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2gr7hn80
2022-09-27T16:05:36.3771874Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2gr7hn80/_remote_module_non_scriptable.py
2022-09-27T16:05:36.4063613Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp121dpndr
2022-09-27T16:05:36.4066292Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp121dpndr/_remote_module_non_scriptable.py
2022-09-27T16:05:36.4533802Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp59vdubjk
2022-09-27T16:05:36.4536658Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp59vdubjk/_remote_module_non_scriptable.py
2022-09-27T16:05:36.8099454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
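[Editor's note] Each `Started process N with pid ...` block above comes from the distributed test harness forking one worker per RPC rank; the real launcher lives in torch.testing._internal.common_distributed. The snippet below is only a minimal sketch of the same four-rank fan-out using the public torch.multiprocessing API, not the actual harness:

```python
# Minimal sketch (assumed, not PyTorch's actual harness) of the four-process
# fan-out behind the "Started process N with pid ..." lines: one child per
# rank, all joined by the parent before the test result is reported.
import os
import torch.multiprocessing as mp

def run_rank(rank: int, world_size: int) -> None:
    # A real test would initialize its TensorPipe RPC agent here.
    print(f"Started process {rank} with pid {os.getpid()}")

if __name__ == "__main__":
    world_size = 4
    mp.spawn(run_rank, args=(world_size,), nprocs=world_size, join=True)
```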
2022-09-27T16:05:36.8184954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:05:36.8355444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:05:36.9143480Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:05:37.0363316Z fi_getinfo: -61
2022-09-27T16:05:43.3691603Z ok (10.295s)
2022-09-27T16:05:43.3691953Z
2022-09-27T16:05:43.3692599Z ----------------------------------------------------------------------
2022-09-27T16:05:43.3693215Z Ran 1 test in 10.295s
2022-09-27T16:05:43.3693524Z
2022-09-27T16:05:43.3693674Z OK
2022-09-27T16:05:43.3693916Z
2022-09-27T16:05:43.3694135Z Generating XML reports...
2022-09-27T16:05:43.3730918Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160533.xml
2022-09-27T16:05:45.3543063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:45.3543573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:45.3546507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:45.3546990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:45.5865712Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0a83fodv
2022-09-27T16:05:45.5867026Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0a83fodv/_remote_module_non_scriptable.py
2022-09-27T16:05:46.0121595Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:05:46.0136621Z
2022-09-27T16:05:46.0136910Z Running tests...
2022-09-27T16:05:46.0137339Z ----------------------------------------------------------------------
2022-09-27T16:05:47.4774741Z test_rref_as_arg_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:05:47.4951461Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28502
2022-09-27T16:05:47.4958346Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28503
2022-09-27T16:05:47.4964793Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28504
2022-09-27T16:05:47.4971098Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28505
2022-09-27T16:05:49.0804987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:49.0805984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:49.0807537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:49.0808482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:49.0959265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:49.0960157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:49.0964510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:49.0965479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:49.1238113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:49.1239385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:49.1241172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:49.1242117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:49.1728154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:05:49.1729009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:05:49.1730428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:05:49.1731227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:05:49.3338504Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu6u0weu6
2022-09-27T16:05:49.3339568Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyju29zu5
2022-09-27T16:05:49.3340606Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu6u0weu6/_remote_module_non_scriptable.py
2022-09-27T16:05:49.3343065Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyju29zu5/_remote_module_non_scriptable.py
2022-09-27T16:05:49.3423620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0v_tiek_
2022-09-27T16:05:49.3425818Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0v_tiek_/_remote_module_non_scriptable.py
2022-09-27T16:05:49.4011189Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4sr0o3b_
2022-09-27T16:05:49.4013183Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4sr0o3b_/_remote_module_non_scriptable.py
2022-09-27T16:05:49.7809406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
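[Editor's note] The `Created a temporary directory ... / Writing ..._remote_module_non_scriptable.py` pairs show torch.distributed.nn.jit.instantiator generating a remote-module stub once per process, each in its own private temp dir. A rough sketch of that pattern (the helper name and module contents here are stand-ins, not the real instantiator code):

```python
# Rough sketch of the instantiator pattern logged above: make a private
# temp directory, then write a generated Python module into it. The module
# source is a placeholder; the real instantiator renders a template.
import logging
import os
import tempfile

logger = logging.getLogger("torch.distributed.nn.jit.instantiator")

def write_remote_module_stub(source: str = "# generated stub\n") -> str:
    tmp_dir = tempfile.mkdtemp()
    logger.info("Created a temporary directory at %s", tmp_dir)
    path = os.path.join(tmp_dir, "_remote_module_non_scriptable.py")
    logger.info("Writing %s", path)
    with open(path, "w") as f:
        f.write(source)
    return path
```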
2022-09-27T16:05:49.7842860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:05:49.7895105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:05:49.8431733Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:05:49.9029210Z fi_getinfo: -61
2022-09-27T16:05:49.9055408Z fi_getinfo: -61
2022-09-27T16:05:49.9125696Z fi_getinfo: -61
2022-09-27T16:05:49.9646068Z fi_getinfo: -61
2022-09-27T16:06:03.9339732Z ok (17.920s)
2022-09-27T16:06:03.9340065Z
2022-09-27T16:06:03.9342383Z ----------------------------------------------------------------------
2022-09-27T16:06:03.9342789Z Ran 1 test in 17.920s
2022-09-27T16:06:03.9342966Z
2022-09-27T16:06:03.9343066Z OK
2022-09-27T16:06:03.9343210Z
2022-09-27T16:06:03.9343329Z Generating XML reports...
2022-09-27T16:06:03.9380054Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160546.xml
2022-09-27T16:06:05.8827771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:05.8828282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:05.8830372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:05.8831394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:06.1137764Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0llsi1d6
2022-09-27T16:06:06.1138916Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0llsi1d6/_remote_module_non_scriptable.py
2022-09-27T16:06:06.5427633Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:06:06.5443894Z
2022-09-27T16:06:06.5444354Z Running tests...
2022-09-27T16:06:06.5444859Z ----------------------------------------------------------------------
2022-09-27T16:06:07.9979749Z test_rref_as_arg_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:06:08.0158656Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28849
2022-09-27T16:06:08.0165284Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28850
2022-09-27T16:06:08.0171773Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 28851
2022-09-27T16:06:08.0177989Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 28852
2022-09-27T16:06:09.6069070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:09.6069834Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:09.6070998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:09.6071511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:09.6076474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:09.6076943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:09.6081162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:09.6081642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:09.6357837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:09.6358302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:09.6361486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:09.6361979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:09.6838482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:09.6838986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:09.6841163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:09.6841650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:09.8351632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb8liobk7
2022-09-27T16:06:09.8352975Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb8liobk7/_remote_module_non_scriptable.py
2022-09-27T16:06:09.8503929Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6k82xlco
2022-09-27T16:06:09.8506720Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6k82xlco/_remote_module_non_scriptable.py
2022-09-27T16:06:09.8561977Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqamu4j4v
2022-09-27T16:06:09.8564761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqamu4j4v/_remote_module_non_scriptable.py
2022-09-27T16:06:09.9102761Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpymtd6abw
2022-09-27T16:06:09.9105371Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpymtd6abw/_remote_module_non_scriptable.py
2022-09-27T16:06:10.2791931Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
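[Editor's note] The UserWarning pairs repeated four times per test (once per spawned rank) come from common_utils.py importing its slow-test and disabled-test tables; the `warnings.warn` calls are quoted verbatim in the log. The loading step presumably looks something like the sketch below (the JSON file names are assumptions, not confirmed by the log):

```python
# Assumed shape of the common_utils loading step echoed above: each process
# reads a JSON table of known-slow / known-disabled tests and emits a
# UserWarning with the entry count ("loaded 45 slow tests", etc.).
import json
import warnings

def load_test_dict(path: str, label: str) -> dict:
    with open(path) as f:
        tests = json.load(f)
    warnings.warn(f"loaded {len(tests)} {label}")
    return tests

# Hypothetical file names for illustration only:
# slow_tests_dict = load_test_dict(".pytorch-slow-tests.json", "slow tests")
# disabled_tests_dict = load_test_dict(".pytorch-disabled-tests.json", "disabled tests")
```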
2022-09-27T16:06:10.2923999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:06:10.2936333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:06:10.3550320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:06:10.4009460Z fi_getinfo: -61
2022-09-27T16:06:10.4138824Z fi_getinfo: -61
2022-09-27T16:06:10.4151222Z fi_getinfo: -61
2022-09-27T16:06:10.4768172Z fi_getinfo: -61
2022-09-27T16:06:26.3525612Z ok (19.808s)
2022-09-27T16:06:26.3525822Z
2022-09-27T16:06:26.3528019Z ----------------------------------------------------------------------
2022-09-27T16:06:26.3528699Z Ran 1 test in 19.808s
2022-09-27T16:06:26.3528875Z
2022-09-27T16:06:26.3528952Z OK
2022-09-27T16:06:26.3531720Z
2022-09-27T16:06:26.3532289Z Generating XML reports...
2022-09-27T16:06:26.3563030Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160606.xml
2022-09-27T16:06:28.3289831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:28.3290342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:28.3294093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:28.3294572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:28.5679447Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9d_rjvm8
2022-09-27T16:06:28.5680407Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9d_rjvm8/_remote_module_non_scriptable.py
2022-09-27T16:06:29.0093090Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:06:29.0108494Z
2022-09-27T16:06:29.0108804Z Running tests...
2022-09-27T16:06:29.0109239Z ----------------------------------------------------------------------
2022-09-27T16:06:30.4858047Z test_rref_as_arg_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:06:30.5016522Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/81962 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.491s)
2022-09-27T16:06:30.5017256Z
2022-09-27T16:06:30.5017549Z ----------------------------------------------------------------------
2022-09-27T16:06:30.5017884Z Ran 1 test in 1.491s
2022-09-27T16:06:30.5018030Z
2022-09-27T16:06:30.5018137Z OK (skipped=1)
2022-09-27T16:06:30.5018622Z
2022-09-27T16:06:30.5018775Z Generating XML reports...
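[Editor's note] test_rref_as_arg_synchronization3 is skipped rather than run: its name appears in the disabled-tests table (tracked by https://github.com/pytorch/pytorch/issues/81962), and the skip message spells out the escape hatch (unset CI, don't pass --import-disabled-tests). Hypothetical gating logic consistent with that message, not PyTorch's actual implementation:

```python
# Hypothetical gating consistent with the skip message above: a test whose
# name is in disabled_tests_dict is skipped while running under CI, with
# the tracking issue URL included in the skip reason.
import os
import unittest

disabled_tests_dict = {
    "test_rref_as_arg_synchronization3": "https://github.com/pytorch/pytorch/issues/81962",
}

def check_disabled(test_name: str) -> None:
    issue = disabled_tests_dict.get(test_name)
    if issue and os.environ.get("CI"):
        raise unittest.SkipTest(
            f"Test is disabled because an issue exists disabling it: {issue}"
        )
```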
2022-09-27T16:06:30.5052310Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160629.xml
2022-09-27T16:06:32.4705180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:32.4705671Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:32.4707254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:32.4707772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:32.7062942Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_s9aops2
2022-09-27T16:06:32.7064196Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_s9aops2/_remote_module_non_scriptable.py
2022-09-27T16:06:33.1445343Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:06:33.1460935Z
2022-09-27T16:06:33.1461484Z Running tests...
2022-09-27T16:06:33.1461958Z ----------------------------------------------------------------------
2022-09-27T16:06:34.6282227Z test_rref_as_arg_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:06:34.6457511Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29236
2022-09-27T16:06:34.6464078Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29237
2022-09-27T16:06:34.6470097Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29238
2022-09-27T16:06:34.6476937Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29239
2022-09-27T16:06:36.2409662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:36.2410167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:36.2411744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:36.2412218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:36.3258517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:36.3259006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:36.3261069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:36.3261563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:36.3311969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:36.3312429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:36.3316327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:36.3316809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:36.3662397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:36.3662892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:36.3665705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:36.3666206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:36.4823693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6faqnb5w
2022-09-27T16:06:36.4825416Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6faqnb5w/_remote_module_non_scriptable.py
2022-09-27T16:06:36.5495079Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbj475o7h
2022-09-27T16:06:36.5496694Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbj475o7h/_remote_module_non_scriptable.py
2022-09-27T16:06:36.5560717Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk3kj0pmq
2022-09-27T16:06:36.5562833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk3kj0pmq/_remote_module_non_scriptable.py
2022-09-27T16:06:36.5908914Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkpqfnlgu
2022-09-27T16:06:36.5911274Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkpqfnlgu/_remote_module_non_scriptable.py
2022-09-27T16:06:36.9174983Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:06:36.9846749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:06:36.9898346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:06:37.0340791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:06:37.0385781Z fi_getinfo: -61
2022-09-27T16:06:37.1068231Z fi_getinfo: -61
2022-09-27T16:06:37.1117533Z fi_getinfo: -61
2022-09-27T16:06:37.1556587Z fi_getinfo: -61
2022-09-27T16:06:53.0848723Z ok (19.938s)
2022-09-27T16:06:53.0848946Z
2022-09-27T16:06:53.0849352Z ----------------------------------------------------------------------
2022-09-27T16:06:53.0853501Z Ran 1 test in 19.939s
2022-09-27T16:06:53.0853988Z
2022-09-27T16:06:53.0854271Z OK
2022-09-27T16:06:53.0854419Z
2022-09-27T16:06:53.0854569Z Generating XML reports...
2022-09-27T16:06:53.0888573Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160633.xml
2022-09-27T16:06:55.0284181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:55.0284685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:55.0286075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:55.0286530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:55.2589147Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpldgzgzlw
2022-09-27T16:06:55.2590318Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpldgzgzlw/_remote_module_non_scriptable.py
2022-09-27T16:06:55.6877912Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:06:55.6894108Z
2022-09-27T16:06:55.6894515Z Running tests...
2022-09-27T16:06:55.6895009Z ----------------------------------------------------------------------
2022-09-27T16:06:57.1562807Z test_rref_as_arg_synchronization5 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:06:57.1746520Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29589
2022-09-27T16:06:57.1753223Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29590
2022-09-27T16:06:57.1760299Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29591
2022-09-27T16:06:57.1766714Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29592
2022-09-27T16:06:58.7696041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:58.7697075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:58.7698342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:58.7699284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:58.7793508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:58.7794402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:58.7797403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:58.7798337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:58.7811002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:58.7811942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:58.7815578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:58.7816575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:58.7970710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:06:58.7971639Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:06:58.7975944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:06:58.7976932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:06:59.0212767Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprbm1uiev
2022-09-27T16:06:59.0214247Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprbm1uiev/_remote_module_non_scriptable.py
2022-09-27T16:06:59.0223750Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzxl0sbam
2022-09-27T16:06:59.0226004Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzxl0sbam/_remote_module_non_scriptable.py
2022-09-27T16:06:59.0261499Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp95_gwqn_
2022-09-27T16:06:59.0263643Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp95_gwqn_/_remote_module_non_scriptable.py
2022-09-27T16:06:59.0324254Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph47aeqdc
2022-09-27T16:06:59.0327134Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph47aeqdc/_remote_module_non_scriptable.py
2022-09-27T16:06:59.4716993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:06:59.4746120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:06:59.4786991Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:06:59.4862120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:06:59.5939417Z fi_getinfo: -61
2022-09-27T16:06:59.5961014Z fi_getinfo: -61
2022-09-27T16:06:59.5997960Z fi_getinfo: -61
2022-09-27T16:06:59.6076910Z fi_getinfo: -61
2022-09-27T16:07:13.2080966Z ok (17.518s)
2022-09-27T16:07:13.2081179Z
2022-09-27T16:07:13.2083215Z ----------------------------------------------------------------------
2022-09-27T16:07:13.2083598Z Ran 1 test in 17.519s
2022-09-27T16:07:13.2083763Z
2022-09-27T16:07:13.2083856Z OK
2022-09-27T16:07:13.2085375Z
2022-09-27T16:07:13.2085753Z Generating XML reports...
2022-09-27T16:07:13.2118684Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160655.xml
2022-09-27T16:07:15.1717530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:15.1718067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:15.1719094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:15.1719553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:15.4050529Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsk2tdm_5
2022-09-27T16:07:15.4051858Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsk2tdm_5/_remote_module_non_scriptable.py
2022-09-27T16:07:15.8358265Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:07:15.8373981Z
2022-09-27T16:07:15.8374382Z Running tests...
2022-09-27T16:07:15.8374815Z ----------------------------------------------------------------------
2022-09-27T16:07:17.3172392Z test_rref_forward_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:07:17.3349777Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29936
2022-09-27T16:07:17.3356869Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29937
2022-09-27T16:07:17.3363429Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 29938
2022-09-27T16:07:17.3370427Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 29939
2022-09-27T16:07:18.9212003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:18.9212511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:18.9213734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:18.9214646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:18.9269003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:18.9269476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:18.9273032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:18.9273513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:18.9847731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:18.9848191Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:18.9851505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:18.9852003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:19.0246734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:19.0247219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:19.0248646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:19.0249111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:19.1504280Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgc3yx1_s
2022-09-27T16:07:19.1504898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgc3yx1_s/_remote_module_non_scriptable.py
2022-09-27T16:07:19.1635161Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnp_5e585
2022-09-27T16:07:19.1637717Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnp_5e585/_remote_module_non_scriptable.py
2022-09-27T16:07:19.2032874Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp87epx2z3
2022-09-27T16:07:19.2035549Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp87epx2z3/_remote_module_non_scriptable.py
2022-09-27T16:07:19.2543174Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzgsj3yrn
2022-09-27T16:07:19.2545965Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzgsj3yrn/_remote_module_non_scriptable.py
2022-09-27T16:07:19.5850967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:07:19.5962600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:07:19.6309124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:07:19.6991120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:07:19.7066705Z fi_getinfo: -61
2022-09-27T16:07:19.7176422Z fi_getinfo: -61
2022-09-27T16:07:19.7520318Z fi_getinfo: -61
2022-09-27T16:07:19.8208464Z fi_getinfo: -61
2022-09-27T16:07:32.2710260Z ok (16.433s)
2022-09-27T16:07:32.2710991Z
2022-09-27T16:07:32.2711445Z ----------------------------------------------------------------------
2022-09-27T16:07:32.2711778Z Ran 1 test in 16.434s
2022-09-27T16:07:32.2711941Z
2022-09-27T16:07:32.2712042Z OK
2022-09-27T16:07:32.2712177Z
2022-09-27T16:07:32.2712335Z Generating XML reports...
2022-09-27T16:07:32.2748339Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160715.xml
2022-09-27T16:07:34.2557538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:34.2558042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:34.2560216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:34.2561017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:34.4920252Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphmep4wfd
2022-09-27T16:07:34.4921204Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphmep4wfd/_remote_module_non_scriptable.py
2022-09-27T16:07:34.9336501Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:07:34.9351953Z
2022-09-27T16:07:34.9352139Z Running tests...
2022-09-27T16:07:34.9352574Z ----------------------------------------------------------------------
2022-09-27T16:07:36.4045380Z test_rref_forward_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:07:36.4221905Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30282
2022-09-27T16:07:36.4228851Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30283
2022-09-27T16:07:36.4236656Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30284
2022-09-27T16:07:36.4243251Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30285
2022-09-27T16:07:38.0072523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:38.0073051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:38.0074240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:38.0074698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:38.0132624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:38.0133089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:38.0137466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:38.0137930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:38.0490100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:38.0490568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:38.0494528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:38.0494988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:38.0951904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:38.0952382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:38.0954569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:38.0955346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:38.2363148Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuzwtywkx
2022-09-27T16:07:38.2364420Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuzwtywkx/_remote_module_non_scriptable.py
2022-09-27T16:07:38.2528518Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp58zsyzia
2022-09-27T16:07:38.2530862Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp58zsyzia/_remote_module_non_scriptable.py
2022-09-27T16:07:38.2739125Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpitnrmh_i
2022-09-27T16:07:38.2741503Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpitnrmh_i/_remote_module_non_scriptable.py
2022-09-27T16:07:38.3214337Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphz8po5ba
2022-09-27T16:07:38.3216411Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphz8po5ba/_remote_module_non_scriptable.py
2022-09-27T16:07:38.6868880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
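[Editor's note] Every `Running tests... / test_name ... ok / Ran 1 test in Ns / OK` block above is standard verbose unittest output; the harness runs a single test method per invocation. For reference, the same shape can be reproduced with the stock runner:

```python
# The "test_... ... ok / Ran 1 test in Ns / OK" blocks above are ordinary
# verbosity-2 unittest output; this stand-alone example produces the same shape.
import unittest

class DemoTest(unittest.TestCase):
    def test_example(self):
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DemoTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```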
2022-09-27T16:07:38.7015505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:07:38.7145117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:07:38.7704858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:07:38.8084326Z fi_getinfo: -61
2022-09-27T16:07:38.8230589Z fi_getinfo: -61
2022-09-27T16:07:38.8359958Z fi_getinfo: -61
2022-09-27T16:07:38.8922401Z fi_getinfo: -61
2022-09-27T16:07:51.4587028Z ok (16.523s)
2022-09-27T16:07:51.4587265Z
2022-09-27T16:07:51.4587647Z ----------------------------------------------------------------------
2022-09-27T16:07:51.4588027Z Ran 1 test in 16.523s
2022-09-27T16:07:51.4588191Z
2022-09-27T16:07:51.4588286Z OK
2022-09-27T16:07:51.4590193Z
2022-09-27T16:07:51.4591551Z Generating XML reports...
2022-09-27T16:07:51.4629729Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160734.xml
2022-09-27T16:07:53.4389481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:53.4389999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:53.4391691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:53.4392148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:53.6688102Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx73pmgav
2022-09-27T16:07:53.6689326Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx73pmgav/_remote_module_non_scriptable.py
2022-09-27T16:07:54.0941436Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:07:54.0956528Z
2022-09-27T16:07:54.0957044Z Running tests...
2022-09-27T16:07:54.0957545Z ----------------------------------------------------------------------
2022-09-27T16:07:55.5405385Z test_rref_forward_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:07:55.5582325Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30631
2022-09-27T16:07:55.5588827Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30632
2022-09-27T16:07:55.5595539Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30633
2022-09-27T16:07:55.5601983Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30634
2022-09-27T16:07:57.1444088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:57.1444934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:57.1445751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:57.1446230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:57.1447087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:57.1447539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:57.1448422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:57.1448886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:57.1617858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:57.1618469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:57.1621613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:57.1622096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:57.1726514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:07:57.1726964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:07:57.1730892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:07:57.1731366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:07:57.4130014Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu6y3fzlo
2022-09-27T16:07:57.4131183Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz6yo2udh
2022-09-27T16:07:57.4132027Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu6y3fzlo/_remote_module_non_scriptable.py
2022-09-27T16:07:57.4134163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz6yo2udh/_remote_module_non_scriptable.py
2022-09-27T16:07:57.4147223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo2zq79wx
2022-09-27T16:07:57.4150057Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo2zq79wx/_remote_module_non_scriptable.py
2022-09-27T16:07:57.4200362Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw9gwghmx
2022-09-27T16:07:57.4203441Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw9gwghmx/_remote_module_non_scriptable.py
2022-09-27T16:07:57.8611282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
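[Editor's note] The `Generated XML report: ...TEST-*.xml` lines indicate JUnit-style XML being written per run, named after the test class plus a timestamp. PyTorch's exact reporter aside, the third-party unittest-xml-reporting package produces files of the same form; a sketch:

```python
# Sketch of JUnit-style XML generation like the "Generated XML report" lines
# above, using the unittest-xml-reporting package (pip install
# unittest-xml-reporting). PyTorch's own harness may wire this up differently.
import unittest
import xmlrunner

class DemoTest(unittest.TestCase):
    def test_example(self):
        self.assertTrue(True)

if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DemoTest)
    # Writes a TEST-*.xml file under test-reports/, one per run.
    xmlrunner.XMLTestRunner(output="test-reports").run(suite)
```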
2022-09-27T16:07:57.8615648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:07:57.8622114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:07:57.8726723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:07:57.9836751Z fi_getinfo: -61
2022-09-27T16:07:57.9840767Z fi_getinfo: -61
2022-09-27T16:07:57.9841329Z fi_getinfo: -61
2022-09-27T16:07:57.9942588Z fi_getinfo: -61
2022-09-27T16:08:10.1879035Z ok (16.092s)
2022-09-27T16:08:10.1879288Z
2022-09-27T16:08:10.1879676Z ----------------------------------------------------------------------
2022-09-27T16:08:10.1880017Z Ran 1 test in 16.092s
2022-09-27T16:08:10.1880178Z
2022-09-27T16:08:10.1880277Z OK
2022-09-27T16:08:10.1880394Z
2022-09-27T16:08:10.1880526Z Generating XML reports...
2022-09-27T16:08:10.1917956Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160754.xml
2022-09-27T16:08:12.1449881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:12.1451233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:12.1452478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:12.1453416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:12.3845104Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi40wt88n
2022-09-27T16:08:12.3846417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi40wt88n/_remote_module_non_scriptable.py
2022-09-27T16:08:12.8270505Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:08:12.8287298Z
2022-09-27T16:08:12.8287620Z Running tests...
2022-09-27T16:08:12.8288357Z ----------------------------------------------------------------------
2022-09-27T16:08:14.3212168Z test_rref_forward_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:08:14.3388398Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30980
2022-09-27T16:08:14.3396299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30981
2022-09-27T16:08:14.3402596Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 30982
2022-09-27T16:08:14.3409025Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 30983
2022-09-27T16:08:15.9134761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:15.9135269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:15.9137154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:15.9137644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:15.9190707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:15.9191730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:15.9194730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:15.9195188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:15.9278688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:15.9279148Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:15.9282507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:15.9282972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:15.9455921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:15.9456390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:15.9460627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:15.9461091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:16.1522809Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplionnqfm
2022-09-27T16:08:16.1524244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplionnqfm/_remote_module_non_scriptable.py
2022-09-27T16:08:16.1607851Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplwqlf1uv
2022-09-27T16:08:16.1608948Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplvh0z201
2022-09-27T16:08:16.1610063Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplwqlf1uv/_remote_module_non_scriptable.py
2022-09-27T16:08:16.1611408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplvh0z201/_remote_module_non_scriptable.py
2022-09-27T16:08:16.1773513Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf46ygyzr
2022-09-27T16:08:16.1774968Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf46ygyzr/_remote_module_non_scriptable.py
2022-09-27T16:08:16.6008208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:08:16.6092089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:08:16.6094411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:08:16.6273015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:08:16.7223252Z fi_getinfo: -61
2022-09-27T16:08:16.7312077Z fi_getinfo: -61
2022-09-27T16:08:16.7316082Z fi_getinfo: -61
2022-09-27T16:08:16.7486539Z fi_getinfo: -61
2022-09-27T16:08:29.0692775Z ok (16.240s)
2022-09-27T16:08:29.0693002Z
2022-09-27T16:08:29.0693409Z ----------------------------------------------------------------------
2022-09-27T16:08:29.0693757Z Ran 1 test in 16.240s
2022-09-27T16:08:29.0693921Z
2022-09-27T16:08:29.0694015Z OK
2022-09-27T16:08:29.0695086Z
2022-09-27T16:08:29.0696572Z Generating XML reports...
2022-09-27T16:08:29.0729918Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160812.xml
2022-09-27T16:08:31.0341509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:31.0342014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:31.0343368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:31.0343848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:31.2720667Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy1x7zhtq
2022-09-27T16:08:31.2721914Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy1x7zhtq/_remote_module_non_scriptable.py
2022-09-27T16:08:31.7103771Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent
2022-09-27T16:08:31.7119334Z
2022-09-27T16:08:31.7119777Z Running tests...
2022-09-27T16:08:31.7120297Z ----------------------------------------------------------------------
2022-09-27T16:08:33.1958303Z test_rref_to_here_synchronization1 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:08:33.2134002Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31326
2022-09-27T16:08:33.2140877Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31327
2022-09-27T16:08:33.2147940Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31328
2022-09-27T16:08:33.2154814Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31329
2022-09-27T16:08:34.7953512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:34.7954025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:34.7956069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:34.7956539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:34.8539171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:34.8539652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:34.8543032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:34.8543737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:34.8879549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:34.8880012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:34.8882184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:34.8882659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:34.8992660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:08:34.8993115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:08:34.8996322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:08:34.8996805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:08:35.0317420Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjt1oq2dl
2022-09-27T16:08:35.0318286Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjt1oq2dl/_remote_module_non_scriptable.py
2022-09-27T16:08:35.0688489Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9o43jfax
2022-09-27T16:08:35.0690832Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9o43jfax/_remote_module_non_scriptable.py
2022-09-27T16:08:35.1195402Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuotei1f1
2022-09-27T16:08:35.1198111Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuotei1f1/_remote_module_non_scriptable.py
2022-09-27T16:08:35.1209344Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpijq8owmk
2022-09-27T16:08:35.1212141Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpijq8owmk/_remote_module_non_scriptable.py
2022-09-27T16:08:35.4685671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:08:35.4965477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:08:35.5579899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:08:35.5717399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:08:35.5898537Z fi_getinfo: -61 2022-09-27T16:08:35.6177860Z fi_getinfo: -61 2022-09-27T16:08:35.6793055Z fi_getinfo: -61 2022-09-27T16:08:35.6932062Z fi_getinfo: -61 2022-09-27T16:08:49.5460498Z ok (17.834s) 2022-09-27T16:08:49.5460725Z 2022-09-27T16:08:49.5464561Z ---------------------------------------------------------------------- 2022-09-27T16:08:49.5464944Z Ran 1 test in 17.834s 2022-09-27T16:08:49.5465125Z 2022-09-27T16:08:49.5465230Z OK 2022-09-27T16:08:49.5465373Z 2022-09-27T16:08:49.5465523Z Generating XML reports... 2022-09-27T16:08:49.5500362Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160831.xml 2022-09-27T16:08:51.5230963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:08:51.5231537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:08:51.5232727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:08:51.5233224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:08:51.7606727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptm1ztpu2 2022-09-27T16:08:51.7608782Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptm1ztpu2/_remote_module_non_scriptable.py 2022-09-27T16:08:52.2008138Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:08:52.2023230Z 2022-09-27T16:08:52.2023619Z Running tests... 2022-09-27T16:08:52.2024052Z ---------------------------------------------------------------------- 2022-09-27T16:08:53.6811572Z test_rref_to_here_synchronization2 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:08:53.6987599Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31673 2022-09-27T16:08:53.6994695Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31674 2022-09-27T16:08:53.7000936Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 31675 2022-09-27T16:08:53.7007761Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 31676 2022-09-27T16:08:55.2907086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:08:55.2907645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:08:55.2909915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:08:55.2910404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:08:55.2976036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:08:55.2976509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:08:55.2979690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:08:55.2980375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:08:55.2996633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:08:55.2997097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:08:55.3000492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:08:55.3000963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:08:55.3086037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:08:55.3086490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:08:55.3090186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:08:55.3090657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:08:55.5350384Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyt6ew2oy 2022-09-27T16:08:55.5351713Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyt6ew2oy/_remote_module_non_scriptable.py 2022-09-27T16:08:55.5365812Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9tkmq4qu 2022-09-27T16:08:55.5368504Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9tkmq4qu/_remote_module_non_scriptable.py 2022-09-27T16:08:55.5420750Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpln0glz1v 2022-09-27T16:08:55.5423812Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpln0glz1v/_remote_module_non_scriptable.py 2022-09-27T16:08:55.5485996Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj7ceum2x 2022-09-27T16:08:55.5488868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj7ceum2x/_remote_module_non_scriptable.py 2022-09-27T16:08:55.9861684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:08:55.9885181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:08:55.9889159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:08:56.0018555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:08:56.1078204Z fi_getinfo: -61 2022-09-27T16:08:56.1099466Z fi_getinfo: -61 2022-09-27T16:08:56.1105190Z fi_getinfo: -61 2022-09-27T16:08:56.1229596Z fi_getinfo: -61 2022-09-27T16:09:11.9353386Z ok (19.733s) 2022-09-27T16:09:11.9353598Z 2022-09-27T16:09:11.9356760Z ---------------------------------------------------------------------- 2022-09-27T16:09:11.9357159Z Ran 1 test in 19.733s 2022-09-27T16:09:11.9357308Z 2022-09-27T16:09:11.9357402Z OK 2022-09-27T16:09:11.9358825Z 2022-09-27T16:09:11.9359432Z Generating XML reports... 2022-09-27T16:09:11.9389982Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160852.xml 2022-09-27T16:09:13.9447344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:13.9448187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:13.9449477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:13.9449941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:14.1818399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo954tewe 2022-09-27T16:09:14.1819258Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo954tewe/_remote_module_non_scriptable.py 2022-09-27T16:09:14.6249951Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:09:14.6265475Z 2022-09-27T16:09:14.6265691Z Running tests... 2022-09-27T16:09:14.6266125Z ---------------------------------------------------------------------- 2022-09-27T16:09:16.1047819Z test_rref_to_here_synchronization3 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:09:16.1224151Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32026 2022-09-27T16:09:16.1230586Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32027 2022-09-27T16:09:16.1236988Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 32028 2022-09-27T16:09:16.1243475Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 32029 2022-09-27T16:09:17.7599839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:17.7600356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:17.7602256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:17.7602732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:17.7674906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:17.7675361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:17.7679490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:17.7679950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:17.7706640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:17.7707086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:17.7710386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:17.7711047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:17.8168751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:17.8169259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:17.8171347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:17.8171809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:17.9998475Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_x3a17vp 2022-09-27T16:09:17.9999714Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_x3a17vp/_remote_module_non_scriptable.py 2022-09-27T16:09:18.0020412Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps9nrwy67 2022-09-27T16:09:18.0023268Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps9nrwy67/_remote_module_non_scriptable.py 2022-09-27T16:09:18.0036693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppthxk8x0 2022-09-27T16:09:18.0039384Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppthxk8x0/_remote_module_non_scriptable.py 2022-09-27T16:09:18.0451835Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdtmvzmee 2022-09-27T16:09:18.0454417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdtmvzmee/_remote_module_non_scriptable.py 2022-09-27T16:09:18.4424387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:09:18.4427762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:09:18.4457842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:09:18.4926222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:09:18.5644855Z fi_getinfo: -61 2022-09-27T16:09:18.5649320Z fi_getinfo: -61 2022-09-27T16:09:18.5669861Z fi_getinfo: -61 2022-09-27T16:09:18.6141592Z fi_getinfo: -61 2022-09-27T16:09:32.1552879Z ok (17.528s) 2022-09-27T16:09:32.1555862Z 2022-09-27T16:09:32.1556457Z ---------------------------------------------------------------------- 2022-09-27T16:09:32.1557047Z Ran 1 test in 17.529s 2022-09-27T16:09:32.1557374Z 2022-09-27T16:09:32.1557562Z OK 2022-09-27T16:09:32.1557855Z 2022-09-27T16:09:32.1561526Z Generating XML reports... 2022-09-27T16:09:32.1589507Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160914.xml 2022-09-27T16:09:34.1327798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:34.1328291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:34.1330143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:34.1330640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:34.3598518Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgnfdb2or 2022-09-27T16:09:34.3600521Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgnfdb2or/_remote_module_non_scriptable.py 2022-09-27T16:09:34.7837300Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:09:34.7852297Z 2022-09-27T16:09:34.7852599Z Running tests... 2022-09-27T16:09:34.7853037Z ---------------------------------------------------------------------- 2022-09-27T16:09:36.2617582Z test_rref_to_here_synchronization4 (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:09:36.2795101Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32373 2022-09-27T16:09:36.2801025Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32374 2022-09-27T16:09:36.2807484Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 32375 2022-09-27T16:09:36.2813763Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 32376 2022-09-27T16:09:37.8686779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:37.8687281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:37.8688601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:37.8689054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:37.8717831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:37.8718723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:37.8721945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:37.8722404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:37.9047672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:37.9048143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:37.9051319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:37.9051780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:37.9714862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:37.9715376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:37.9717775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:37.9718243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:38.0970449Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwv6op02_ 2022-09-27T16:09:38.0971436Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwv6op02_/_remote_module_non_scriptable.py 2022-09-27T16:09:38.1119962Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq9r5v2bh 2022-09-27T16:09:38.1122593Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq9r5v2bh/_remote_module_non_scriptable.py 2022-09-27T16:09:38.1226514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdhp1nrld 2022-09-27T16:09:38.1229209Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdhp1nrld/_remote_module_non_scriptable.py 2022-09-27T16:09:38.1980790Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzebgkarn 2022-09-27T16:09:38.1983271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzebgkarn/_remote_module_non_scriptable.py 2022-09-27T16:09:38.5419841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T16:09:38.5547153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:09:38.5553919Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:09:38.6487284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:09:38.6635581Z fi_getinfo: -61 2022-09-27T16:09:38.6762268Z fi_getinfo: -61 2022-09-27T16:09:38.6768213Z fi_getinfo: -61 2022-09-27T16:09:38.7707939Z fi_getinfo: -61 2022-09-27T16:09:54.8237876Z ok (20.038s) 2022-09-27T16:09:54.8238122Z 2022-09-27T16:09:54.8238529Z ---------------------------------------------------------------------- 2022-09-27T16:09:54.8238873Z Ran 1 test in 20.038s 2022-09-27T16:09:54.8241614Z 2022-09-27T16:09:54.8241889Z OK 2022-09-27T16:09:54.8242324Z 2022-09-27T16:09:54.8242501Z Generating XML reports... 2022-09-27T16:09:54.8276179Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160934.xml 2022-09-27T16:09:56.7705853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:09:56.7706343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:09:56.7707814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:09:56.7708294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:09:56.9999294Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps4e_qwq5 2022-09-27T16:09:57.0000125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps4e_qwq5/_remote_module_non_scriptable.py 2022-09-27T16:09:57.4253290Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:09:57.4267969Z 2022-09-27T16:09:57.4268384Z Running tests... 2022-09-27T16:09:57.4268868Z ---------------------------------------------------------------------- 2022-09-27T16:09:58.8891298Z test_rref_with_unpickleable_attributes (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:09:58.9068033Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32726 2022-09-27T16:09:58.9074827Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32727 2022-09-27T16:09:58.9081212Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 32728 2022-09-27T16:09:58.9087411Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 32729 2022-09-27T16:10:00.4970230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:00.4970999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:00.4971818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:00.4972295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:00.4994323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:00.4994789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:00.4999141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:00.4999627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:00.5040491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:00.5040958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:00.5044953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:00.5045428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:00.5089958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:00.5090430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:00.5094527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:00.5095004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:00.7347558Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsi_lgw7i 2022-09-27T16:10:00.7348422Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsi_lgw7i/_remote_module_non_scriptable.py 2022-09-27T16:10:00.7351070Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpegxcvzm7 2022-09-27T16:10:00.7354199Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpegxcvzm7/_remote_module_non_scriptable.py 2022-09-27T16:10:00.7441361Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw1lrcce0 2022-09-27T16:10:00.7444711Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw1lrcce0/_remote_module_non_scriptable.py 2022-09-27T16:10:00.7479305Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7vh6yft7 2022-09-27T16:10:00.7481917Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7vh6yft7/_remote_module_non_scriptable.py 2022-09-27T16:10:01.1841597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T16:10:01.1868182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:01.2052695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:10:01.2062928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:10:01.3061138Z fi_getinfo: -61 2022-09-27T16:10:01.3083194Z fi_getinfo: -61 2022-09-27T16:10:01.3268541Z fi_getinfo: -61 2022-09-27T16:10:01.3276874Z fi_getinfo: -61 2022-09-27T16:10:04.9247519Z ok (7.498s) 2022-09-27T16:10:04.9247732Z 2022-09-27T16:10:04.9248132Z ---------------------------------------------------------------------- 2022-09-27T16:10:04.9248448Z Ran 1 test in 7.498s 2022-09-27T16:10:04.9248614Z 2022-09-27T16:10:04.9252784Z OK 2022-09-27T16:10:04.9253200Z 2022-09-27T16:10:04.9253464Z Generating XML reports... 2022-09-27T16:10:04.9288361Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160957.xml 2022-09-27T16:10:06.8971172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:06.8971689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:06.8973798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:06.8974296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:07.1267852Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3pzkas81 2022-09-27T16:10:07.1269053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3pzkas81/_remote_module_non_scriptable.py 2022-09-27T16:10:07.5533884Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:10:07.5548764Z 2022-09-27T16:10:07.5548980Z Running tests... 2022-09-27T16:10:07.5549461Z ---------------------------------------------------------------------- 2022-09-27T16:10:09.0085534Z test_tensor_view_as_return_value (__main__.TensorPipeTensorPipeAgentCudaRpcTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:10:09.0263517Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33077 2022-09-27T16:10:09.0269786Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33078 2022-09-27T16:10:09.0276667Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 33079 2022-09-27T16:10:09.0283305Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 33080 2022-09-27T16:10:10.6020476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:10.6021457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:10.6022696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:10.6024014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:10.6050648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:10.6051534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:10.6055362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:10.6056331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:10.6329835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:10.6330672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:10.6333585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:10.6334864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:10.6521290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:10.6522212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:10.6524659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:10.6525619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:10.8267734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpths1ycyw 2022-09-27T16:10:10.8268893Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpths1ycyw/_remote_module_non_scriptable.py 2022-09-27T16:10:10.8480481Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0yv4fgm6 2022-09-27T16:10:10.8482659Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0yv4fgm6/_remote_module_non_scriptable.py 2022-09-27T16:10:10.8650945Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzlse63yf 2022-09-27T16:10:10.8653179Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzlse63yf/_remote_module_non_scriptable.py 2022-09-27T16:10:10.8727119Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcbjplk7h 2022-09-27T16:10:10.8729387Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcbjplk7h/_remote_module_non_scriptable.py 2022-09-27T16:10:11.2680397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T16:10:11.2907477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:10:11.3144061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:11.3186323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:10:11.3897674Z fi_getinfo: -61 2022-09-27T16:10:11.4125086Z fi_getinfo: -61 2022-09-27T16:10:11.4358481Z fi_getinfo: -61 2022-09-27T16:10:11.4404779Z fi_getinfo: -61 2022-09-27T16:10:16.8449483Z ok (9.290s) 2022-09-27T16:10:16.8449720Z 2022-09-27T16:10:16.8450380Z ---------------------------------------------------------------------- 2022-09-27T16:10:16.8450724Z Ran 1 test in 9.290s 2022-09-27T16:10:16.8450890Z 2022-09-27T16:10:16.8450986Z OK 2022-09-27T16:10:16.8454059Z 2022-09-27T16:10:16.8454417Z Generating XML reports... 2022-09-27T16:10:16.8487801Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927161007.xml 2022-09-27T16:10:18.8332202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:18.8333266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:18.8334837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:18.8335911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:19.0727143Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa9kdtxm0 2022-09-27T16:10:19.0728379Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa9kdtxm0/_remote_module_non_scriptable.py 2022-09-27T16:10:19.5155961Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:10:19.5171388Z 2022-09-27T16:10:19.5171744Z Running tests... 2022-09-27T16:10:19.5172654Z ---------------------------------------------------------------------- 2022-09-27T16:10:21.0059528Z test_device_maps_backward_pass (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:10:21.0242830Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33724 2022-09-27T16:10:21.0250409Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33725 2022-09-27T16:10:21.0257012Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 33726 2022-09-27T16:10:21.0263549Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 33727 2022-09-27T16:10:22.6174630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:22.6175150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:22.6175720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:22.6176177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:22.6176734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:22.6177200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:22.6177772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:22.6178240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:22.6178820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:22.6179266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:22.6179846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:22.6180301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:22.6264766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:22.6265215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:22.6268449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:22.6268921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:22.8576242Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpydife9h5 2022-09-27T16:10:22.8577306Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpydife9h5/_remote_module_non_scriptable.py 2022-09-27T16:10:22.8782697Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5_cfr3ym 2022-09-27T16:10:22.8785752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5_cfr3ym/_remote_module_non_scriptable.py 2022-09-27T16:10:22.8787411Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg1cox3id 2022-09-27T16:10:22.8790438Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg1cox3id/_remote_module_non_scriptable.py 2022-09-27T16:10:22.8822279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8no55o0q 2022-09-27T16:10:22.8825157Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8no55o0q/_remote_module_non_scriptable.py 2022-09-27T16:10:23.3070060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-09-27T16:10:23.3242543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:10:23.3243064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:23.3251783Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:10:23.7330612Z skip: Need at least 4 CUDA devices (4.216s) 2022-09-27T16:10:23.7330854Z 2022-09-27T16:10:23.7331229Z ---------------------------------------------------------------------- 2022-09-27T16:10:23.7331901Z Ran 1 test in 4.216s 2022-09-27T16:10:23.7332065Z 2022-09-27T16:10:23.7332177Z OK (skipped=1) 2022-09-27T16:10:23.7332332Z 2022-09-27T16:10:23.7332460Z Generating XML reports... 2022-09-27T16:10:23.7368950Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220927161019.xml 2022-09-27T16:10:25.7188071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:25.7188885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:25.7190105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:25.7190556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:25.9520546Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjefmn45k 2022-09-27T16:10:25.9521679Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjefmn45k/_remote_module_non_scriptable.py 2022-09-27T16:10:26.3826459Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:10:26.3842122Z 2022-09-27T16:10:26.3842273Z Running tests... 2022-09-27T16:10:26.3842960Z ---------------------------------------------------------------------- 2022-09-27T16:10:27.8500034Z test_dist_autograd_sync_streams (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... 
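
The "skip: Need at least 4 CUDA devices" verdict matches the message of PyTorch's internal skip_if_lt_x_gpu guard: this runner exposes fewer than four GPUs, so the device-map autograd tests (one GPU per rank) skip cleanly. The pattern, sketched with an illustrative test class:

    from torch.testing._internal.common_distributed import (
        MultiProcessTestCase,
        skip_if_lt_x_gpu,
    )

    class ExampleTest(MultiProcessTestCase):  # stand-in for the real RPC fixture
        @skip_if_lt_x_gpu(4)
        def test_needs_four_gpus(self):
            # Each child exits with a skip code when < 4 GPUs are visible; the
            # parent reports it as "Need at least 4 CUDA devices".
            pass
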
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:10:27.8678372Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33895 2022-09-27T16:10:27.8684978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33896 2022-09-27T16:10:27.8691105Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 33897 2022-09-27T16:10:27.8697457Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 33898 2022-09-27T16:10:29.4582486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:29.4583231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:29.4584074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:29.4584556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:29.4834543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:29.4835023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:29.4838101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:29.4838598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:29.4898425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:29.4898905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:29.4902998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:29.4903510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:29.4972502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:29.4972960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:29.4976410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:29.4976889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:29.7165068Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2gcs4qc7 2022-09-27T16:10:29.7166446Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2gcs4qc7/_remote_module_non_scriptable.py 2022-09-27T16:10:29.7171436Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4pged8ge 2022-09-27T16:10:29.7173964Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2uagc181 2022-09-27T16:10:29.7174527Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4pged8ge/_remote_module_non_scriptable.py 2022-09-27T16:10:29.7176687Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2uagc181/_remote_module_non_scriptable.py 2022-09-27T16:10:29.7292132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6jx51k2k 2022-09-27T16:10:29.7294884Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6jx51k2k/_remote_module_non_scriptable.py 2022-09-27T16:10:30.1652896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 
2022-09-27T16:10:30.1703226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:30.1714260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:10:30.1832694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:10:30.5767210Z skip: Need at least 4 CUDA devices (4.192s) 2022-09-27T16:10:30.5767441Z 2022-09-27T16:10:30.5767810Z ---------------------------------------------------------------------- 2022-09-27T16:10:30.5768141Z Ran 1 test in 4.192s 2022-09-27T16:10:30.5768305Z 2022-09-27T16:10:30.5768415Z OK (skipped=1) 2022-09-27T16:10:30.5768569Z 2022-09-27T16:10:30.5768692Z Generating XML reports... 2022-09-27T16:10:30.5811394Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220927161026.xml 2022-09-27T16:10:32.5517893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:32.5518406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:32.5520608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:32.5521083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:32.7876818Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpteimsgyg 2022-09-27T16:10:32.7878130Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpteimsgyg/_remote_module_non_scriptable.py 2022-09-27T16:10:33.2326315Z Test results will be stored in test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent 2022-09-27T16:10:33.2342681Z 2022-09-27T16:10:33.2342922Z Running tests... 2022-09-27T16:10:33.2343378Z ---------------------------------------------------------------------- 2022-09-27T16:10:34.7533440Z test_gradients_synchronizations (__main__.TensorPipeTensorPipeCudaDistAutogradTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:10:34.7717439Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34066 2022-09-27T16:10:34.7724785Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34067 2022-09-27T16:10:34.7731386Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 34068 2022-09-27T16:10:34.7738476Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 34069 2022-09-27T16:10:36.3537531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:36.3538027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:36.3540017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:36.3540488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:36.3541559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:36.3542025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:36.3545534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:36.3545989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:36.4215087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:36.4215560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:36.4218216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:36.4218671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:36.4637664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:36.4638145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:36.4640503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:36.4640982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:36.5768803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeiaswz_9 2022-09-27T16:10:36.5770171Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeiaswz_9/_remote_module_non_scriptable.py 2022-09-27T16:10:36.5930577Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn84tj9pd 2022-09-27T16:10:36.5933233Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn84tj9pd/_remote_module_non_scriptable.py 2022-09-27T16:10:36.6393315Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz_puq3xb 2022-09-27T16:10:36.6396028Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz_puq3xb/_remote_module_non_scriptable.py 2022-09-27T16:10:36.6885323Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9at_0n2k 2022-09-27T16:10:36.6888043Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9at_0n2k/_remote_module_non_scriptable.py 2022-09-27T16:10:37.0204858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
2022-09-27T16:10:37.0313691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:10:37.0818926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:37.1552457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:10:37.5806145Z skip: Need at least 4 CUDA devices (4.346s) 2022-09-27T16:10:37.5806401Z 2022-09-27T16:10:37.5806800Z ---------------------------------------------------------------------- 2022-09-27T16:10:37.5807142Z Ran 1 test in 4.346s 2022-09-27T16:10:37.5807288Z 2022-09-27T16:10:37.5807669Z OK (skipped=1) 2022-09-27T16:10:37.5807842Z 2022-09-27T16:10:37.5807970Z Generating XML reports... 2022-09-27T16:10:37.5844225Z Generated XML report: test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220927161033.xml 2022-09-27T16:10:38.1823291Z Running distributed/fsdp/test_fsdp_core ... [2022-09-27 16:10:38.181839] 2022-09-27T16:10:38.1824019Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:10:38.181919] 2022-09-27T16:10:40.0720524Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_core 2022-09-27T16:10:40.0751490Z 2022-09-27T16:10:40.0751916Z Running tests... 2022-09-27T16:10:40.0752671Z ---------------------------------------------------------------------- 2022-09-27T16:10:40.0757860Z test_pre_backward_hook_registration_after_state_dict (__main__.TestHooks) 2022-09-27T16:10:41.6003542Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:10:41.6189516Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34237 2022-09-27T16:10:41.6196527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34238 2022-09-27T16:10:43.2591551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:43.2592059Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:43.2594750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:43.2595236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:43.3221397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:43.3221877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:43.3225199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:43.3225683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:43.5096938Z dist init r=0, world=2 2022-09-27T16:10:43.5101107Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:10:43.5645198Z dist init r=1, world=2 2022-09-27T16:10:43.5650370Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:10:43.5651406Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
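
Here the shard moves from the RPC suite to distributed/fsdp/test_fsdp_core.py; the --import-slow-tests and --import-disabled-tests flags in the Executing line are what produce the recurring "loaded 45 slow tests" / "loaded 261 disabled tests" warnings in every child process. The "Started process N with pid ..." and "dist init r=R, world=W" lines come from PyTorch's multi-process test harness, which forks one process per rank; a condensed sketch of its shape (illustrative, not the real test file):

    from torch.testing._internal.common_distributed import MultiProcessTestCase

    class TestHooksSketch(MultiProcessTestCase):
        @property
        def world_size(self):
            return 2  # matches the "world=2" in the dist init lines above

        def setUp(self):
            super().setUp()
            self._spawn_processes()  # logs "Started process N with pid ..."
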
2022-09-27T16:10:43.5710255Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:10:44.9337130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:10:44.9338149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:44.9629994Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:10:44.9631824Z warnings.warn( 2022-09-27T16:10:44.9666571Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:10:44.9668134Z warnings.warn( 2022-09-27T16:10:46.2291830Z ok (6.154s) 2022-09-27T16:10:46.2295795Z test_pre_backward_hook_registration_cuda_first_False (__main__.TestHooks) 2022-09-27T16:10:46.2310991Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34322 2022-09-27T16:10:46.2317708Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34323 2022-09-27T16:10:47.8474864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:47.8475355Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:47.8478586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:47.8479358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:47.8691466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:47.8691919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:47.8696096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:47.8696828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:48.1067331Z dist init r=0, world=2 2022-09-27T16:10:48.1070894Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:10:48.1157217Z dist init r=1, world=2 2022-09-27T16:10:48.1162025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:10:48.1163069Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:10:48.1173514Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
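
The UserWarning from fully_sharded_data_parallel.py:1414 (printed once per rank) recommends passing device_id so that FSDP moves the module to GPU before flattening and sharding. A minimal sketch of the recommended construction, assuming the default process group is already initialized:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = torch.nn.Linear(8, 8)  # stand-in module, initially on CPU
    # With device_id, FSDP moves the module to this GPU first, so flattening
    # and sharding run on-device; it also satisfies the precondition the
    # warning mentions for sync_module_states=True.
    fsdp_model = FSDP(model, device_id=torch.cuda.current_device())
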
2022-09-27T16:10:49.4779368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:49.4779898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:10:49.5111236Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:10:49.5112056Z warnings.warn( 2022-09-27T16:10:49.5113160Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:10:49.5113895Z warnings.warn( 2022-09-27T16:10:50.7412882Z ok (4.512s) 2022-09-27T16:10:50.7416655Z test_pre_backward_hook_registration_cuda_first_True (__main__.TestHooks) 2022-09-27T16:10:50.7430730Z Tests that FSDP pre-backward hooks are registered on forward pass ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34407 2022-09-27T16:10:50.7437352Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34408 2022-09-27T16:10:52.3944619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:52.3945403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:52.3947830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:52.3948572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:52.4193151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:52.4193619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:52.4198052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:52.4198767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:52.6587357Z dist init r=0, world=2 2022-09-27T16:10:52.6591226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:10:52.6611882Z dist init r=1, world=2 2022-09-27T16:10:52.6616937Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:10:52.6618137Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:10:52.6694414Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-09-27T16:10:54.0229545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:54.0230078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:10:55.2528766Z ok (4.511s) 2022-09-27T16:10:55.2537270Z test_register_functions_called_cuda_first_False_mixed_precision_False (__main__.TestHooks) 2022-09-27T16:10:55.2551220Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34492 2022-09-27T16:10:55.2558289Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34493 2022-09-27T16:10:56.9123870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:56.9124367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:56.9127569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:56.9128040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:56.9139176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:10:56.9139642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:10:56.9143875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:10:56.9144340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:10:57.1482284Z dist init r=1, world=2 2022-09-27T16:10:57.1485892Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:10:57.1694355Z dist init r=0, world=2 2022-09-27T16:10:57.1699537Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:10:57.1700323Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:10:57.1791806Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:10:58.5249842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:10:58.5250416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:10:58.5547667Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:10:58.5548468Z warnings.warn( 2022-09-27T16:10:58.5549829Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:10:58.5550858Z warnings.warn( 2022-09-27T16:10:59.7649827Z ok (4.512s) 2022-09-27T16:10:59.7658144Z test_register_functions_called_cuda_first_False_mixed_precision_True (__main__.TestHooks) 2022-09-27T16:10:59.7672131Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34573 2022-09-27T16:10:59.7678945Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34574 2022-09-27T16:11:01.4106516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:01.4107030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:01.4110000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:01.4110487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:01.4299120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:01.4299610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:01.4303520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:01.4303997Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:01.6751439Z dist init r=1, world=2 2022-09-27T16:11:01.6755762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:01.6828868Z dist init r=0, world=2 2022-09-27T16:11:01.6834440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:01.6835464Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:01.6858441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:03.0741243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:03.0741799Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:03.1020608Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-09-27T16:11:03.1021467Z warnings.warn( 2022-09-27T16:11:03.1035597Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:11:03.1036393Z warnings.warn( 2022-09-27T16:11:03.1053322Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-09-27T16:11:03.1054046Z warnings.warn( 2022-09-27T16:11:03.1067373Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:03.1068286Z warnings.warn( 2022-09-27T16:11:04.2789041Z ok (4.514s) 2022-09-27T16:11:04.2797838Z test_register_functions_called_cuda_first_True_mixed_precision_False (__main__.TestHooks) 2022-09-27T16:11:04.2812078Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34654 2022-09-27T16:11:04.2818436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34655 2022-09-27T16:11:05.9231989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:05.9232480Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:05.9235708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:05.9236206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:05.9388860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:05.9389298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:05.9393758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:05.9394236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:06.1881188Z dist init r=0, world=2 2022-09-27T16:11:06.1885840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:06.1927855Z dist init r=1, world=2 2022-09-27T16:11:06.1933260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:06.1934236Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:06.1988965Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-09-27T16:11:07.5614378Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:07.5614884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:08.8924034Z ok (4.613s) 2022-09-27T16:11:08.8932767Z test_register_functions_called_cuda_first_True_mixed_precision_True (__main__.TestHooks) 2022-09-27T16:11:08.8946688Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34735 2022-09-27T16:11:08.8953598Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34736 2022-09-27T16:11:10.5192299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:10.5193137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:10.5195798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:10.5196282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:10.5525414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:10.5525870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:10.5530336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:10.5530841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:10.7727007Z dist init r=0, world=2 2022-09-27T16:11:10.7730541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:10.7999333Z dist init r=1, world=2 2022-09-27T16:11:10.8004345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:10.8005133Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:10.8035913Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:12.1650115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:12.1650635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:12.1970699Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-09-27T16:11:12.1971468Z warnings.warn( 2022-09-27T16:11:12.1999664Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 
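The fully_sharded_data_parallel.py:1270 warning above fires when mixed precision is combined with an auto_wrap_policy over a module containing batch norm; FSDP then wraps the batch-norm submodules separately with mixed precision disabled. A hedged toy reproduction (module and policy are illustrative, not from the test):

# Illustrative setup that triggers the batch-norm warning: mixed precision
# plus an auto_wrap_policy on a model with a BatchNorm submodule.
import functools
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision
from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy

model = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8))  # toy model
fp16 = MixedPrecision(
    param_dtype=torch.float16,
    reduce_dtype=torch.float16,
    buffer_dtype=torch.float16,
)
wrap_policy = functools.partial(size_based_auto_wrap_policy, min_num_params=1)
fsdp_model = FSDP(model, mixed_precision=fp16, auto_wrap_policy=wrap_policy)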
2022-09-27T16:11:12.2000398Z warnings.warn( 2022-09-27T16:11:13.4043258Z ok (4.512s) 2022-09-27T16:11:13.4052956Z test_transformer_no_grad_mixed_precision_False (__main__.TestNoGrad) 2022-09-27T16:11:13.4068789Z Tests that for an FSDP-wrapped transformer model with shared ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34816 2022-09-27T16:11:13.4075896Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34817 2022-09-27T16:11:15.0467232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:15.0467734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:15.0470630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:15.0471335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:15.0664397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:15.0664863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:15.0669577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:15.0670065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:15.3093555Z dist init r=1, world=2 2022-09-27T16:11:15.3097490Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:15.3169814Z dist init r=0, world=2 2022-09-27T16:11:15.3175230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:15.3176024Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:15.3200312Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:16.6998415Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:16.6999261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:16.7310525Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:16.7311723Z warnings.warn( 2022-09-27T16:11:16.7312850Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:11:16.7313598Z warnings.warn( 2022-09-27T16:11:18.0171508Z ok (4.613s) 2022-09-27T16:11:18.0186750Z test_transformer_no_grad_mixed_precision_True (__main__.TestNoGrad) 2022-09-27T16:11:18.0200499Z Tests that for an FSDP-wrapped transformer model with shared ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34901 2022-09-27T16:11:18.0207391Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34902 2022-09-27T16:11:19.6438014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:19.6438518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:19.6441024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:19.6441494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:19.6767085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:19.6767551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:19.6772222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:19.6772684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:19.8981051Z dist init r=0, world=2 2022-09-27T16:11:19.8985773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:19.9122548Z dist init r=1, world=2 2022-09-27T16:11:19.9127586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:19.9128650Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:19.9190226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:21.2717026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:21.2717579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:21.3017925Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-09-27T16:11:21.3018697Z warnings.warn( 2022-09-27T16:11:21.3019759Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-09-27T16:11:21.3020642Z warnings.warn( 2022-09-27T16:11:21.3031875Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:21.3032655Z warnings.warn( 2022-09-27T16:11:21.3033743Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:21.3034497Z warnings.warn( 2022-09-27T16:11:22.5297738Z ok (4.512s) 2022-09-27T16:11:22.5306646Z test_param_change_after_init_mixed_precision_False (__main__.TestParamInit) 2022-09-27T16:11:22.5320390Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34986 2022-09-27T16:11:22.5326394Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34987 2022-09-27T16:11:24.1895183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:24.1895791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:24.1898459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:24.1898984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:24.2195657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:24.2196132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:24.2200122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:24.2200621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:24.4521771Z dist init r=0, world=2 2022-09-27T16:11:24.4525758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:24.4671889Z dist init r=1, world=2 2022-09-27T16:11:24.4677516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:24.4678590Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:24.4731280Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:25.8408037Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:25.8408578Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:25.8712151Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:11:25.8713181Z warnings.warn( 2022-09-27T16:11:25.8751713Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:25.8752476Z warnings.warn( 2022-09-27T16:11:27.0417092Z ok (4.512s) 2022-09-27T16:11:27.0425348Z test_param_change_after_init_mixed_precision_True (__main__.TestParamInit) 2022-09-27T16:11:27.0439781Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35067 2022-09-27T16:11:27.0445957Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35068 2022-09-27T16:11:28.6879973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:28.6880541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:28.6886139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:28.6886629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:28.6959389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:28.6959862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:28.6964557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:28.6965036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:28.9454981Z dist init r=0, world=2 2022-09-27T16:11:28.9459222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:28.9556678Z dist init r=1, world=2 2022-09-27T16:11:28.9562110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:28.9562872Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:28.9563560Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:30.3355025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:30.3355542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:30.3663437Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 
2022-09-27T16:11:30.3664408Z warnings.warn( 2022-09-27T16:11:30.3678123Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:30.3678892Z warnings.warn( 2022-09-27T16:11:30.3698745Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1270: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-09-27T16:11:30.3699624Z warnings.warn( 2022-09-27T16:11:30.3713738Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:30.3714500Z warnings.warn( 2022-09-27T16:11:31.5535795Z ok (4.512s) 2022-09-27T16:11:31.5541552Z test_delayed_optim_step_offload_false_no_shard (__main__.TestParityWithDDP) 2022-09-27T16:11:31.5557200Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35148 2022-09-27T16:11:31.5564161Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35149 2022-09-27T16:11:33.1696284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:33.1696772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:33.1699792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:33.1700272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:33.2179411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:33.2179862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:33.2184517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:33.2184999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:33.4216618Z dist init r=0, world=2 2022-09-27T16:11:33.4220484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:33.4608029Z dist init r=1, world=2 2022-09-27T16:11:33.4613752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:33.4614531Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
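The TestParityWithDDP cases starting above compare FSDP against a DistributedDataParallel baseline; the "Reducer buckets have been rebuilt in this iteration" INFO lines that follow come from the DDP side rebuilding its gradient buckets on early iterations. A minimal sketch of that side, assuming an initialized process group and one GPU per rank (the model is a placeholder):

# Placeholder DDP setup: the first backward pass(es) rebuild the reducer's
# gradient buckets, which is what logs the INFO lines seen below.
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

device = torch.cuda.current_device()
model = nn.Linear(8, 8).to(device)          # placeholder module on this rank's GPU
ddp_model = DDP(model, device_ids=[device])
out = ddp_model(torch.randn(4, 8, device=device))
out.sum().backward()                        # triggers bucket (re)building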
2022-09-27T16:11:33.4627437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:34.8150521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:34.8151291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:35.5199351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:35.5199880Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:35.5232943Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:35.5233775Z warnings.warn( 2022-09-27T16:11:35.5234895Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:35.5235759Z warnings.warn( 2022-09-27T16:11:36.0995819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:11:36.0996517Z warnings.warn(msg, FutureWarning) 2022-09-27T16:11:36.1006053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:11:36.1006718Z warnings.warn(msg, FutureWarning) 2022-09-27T16:11:36.2487481Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:36.2487988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:36.8428179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:36.8428694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:37.4343366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:37.4343893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:38.0258051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:38.0258557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:38.6176583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
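The torch.testing.assert_allclose FutureWarning above has a direct migration; a small before/after sketch with illustrative tensors:

# Migration the FutureWarning asks for (details in gh-61844).
import torch

a = torch.tensor([1.0, 2.0])
b = a.clone()

# Deprecated since 1.12, removal slated for 1.14:
# torch.testing.assert_allclose(a, b)

# Replacement; note assert_close checks dtype and device by default,
# which assert_allclose did not:
torch.testing.assert_close(a, b)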
2022-09-27T16:11:38.6177133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:39.2088572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:39.2089086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:39.8033227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:39.8033763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:40.3971359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:40.3971864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:40.9918033Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:40.9918567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:41.5855798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:41.5856472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:41.7421264Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:11:41.7422205Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-09-27T16:11:41.7423463Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:11:41.7424598Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-09-27T16:11:42.1799559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:42.1800118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:43.0760070Z ok (11.522s) 2022-09-27T16:11:43.0764739Z test_delayed_optim_step_offload_false_none (__main__.TestParityWithDDP) 2022-09-27T16:11:43.0779940Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35233 2022-09-27T16:11:43.0787418Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35234 2022-09-27T16:11:44.7169147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:44.7169665Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:44.7172105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:44.7172592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:44.7333697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:11:44.7334161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:11:44.7338007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:11:44.7338482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:11:44.9760762Z dist init r=1, world=2 2022-09-27T16:11:44.9761473Z dist init r=0, world=2 2022-09-27T16:11:44.9764199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:11:44.9767345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:11:44.9768374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:44.9867040Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:11:46.3275559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:11:46.3276079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:11:47.0150590Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:47.0151326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:47.0184038Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:47.0184946Z warnings.warn( 2022-09-27T16:11:47.0186059Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:11:47.0186913Z warnings.warn( 2022-09-27T16:11:47.7987665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. 
Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:11:47.7988390Z warnings.warn(msg, FutureWarning) 2022-09-27T16:11:47.7989321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:11:47.7989975Z warnings.warn(msg, FutureWarning) 2022-09-27T16:11:48.0505864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:48.0506384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:49.0850585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:49.0851112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:50.1189426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:50.1189955Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:51.1537713Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:51.1538223Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:52.1877937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:52.1878476Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:53.2221425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:53.2221975Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:54.2589641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:54.2590177Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:55.2960059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:55.2960602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:56.3332923Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:56.3333933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:57.3711739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:57.3712744Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:57.6313468Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:11:57.6316052Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:11:57.6318520Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:11:58.4109148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:58.4110183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:11:59.7063084Z ok (16.630s) 2022-09-27T16:11:59.7068897Z test_delayed_optim_step_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-09-27T16:11:59.7084736Z Tests the FSDP forward, backward, and optimizer step runtime by ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35318 2022-09-27T16:11:59.7092238Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35319 2022-09-27T16:12:01.3507242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:01.3507732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:01.3510909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:01.3511628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:01.3769012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:01.3769450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:01.3773935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:01.3774437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:01.6127914Z dist init r=1, world=2 2022-09-27T16:12:01.6132484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:12:01.6268222Z dist init r=0, world=2 2022-09-27T16:12:01.6273656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:12:01.6274789Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:01.6336602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:03.0023975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:12:03.0024563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:12:03.7145625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:03.7146178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:03.7179726Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:12:03.7180629Z warnings.warn( 2022-09-27T16:12:03.7181735Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:12:03.7182478Z warnings.warn( 2022-09-27T16:12:04.5007836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. 
Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:12:04.5008582Z warnings.warn(msg, FutureWarning) 2022-09-27T16:12:04.5026787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:12:04.5027461Z warnings.warn(msg, FutureWarning) 2022-09-27T16:12:04.7547241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:04.7547723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:05.7903565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:05.7904104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:06.8256289Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:06.8256844Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:07.8613075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:07.8613908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:08.8966831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:08.8967368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:09.9320203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:09.9320801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:10.9699093Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:10.9699601Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:12.0083079Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:12.0083824Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:13.0464247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:13.0464763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:14.0849020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:14.0849532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:14.3456114Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:14.3457427Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:14.3458649Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:15.1250495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:15.1251175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:16.3362932Z ok (16.630s) 2022-09-27T16:12:16.3368103Z test_delayed_optim_step_offload_true_no_shard (__main__.TestParityWithDDP) 2022-09-27T16:12:16.3372220Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82490 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests.
(0.001s) 2022-09-27T16:12:16.3376942Z test_delayed_optim_step_offload_true_none (__main__.TestParityWithDDP) 2022-09-27T16:12:16.3390460Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35403 2022-09-27T16:12:16.3397311Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35404 2022-09-27T16:12:17.9715281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:17.9715928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:17.9718699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:17.9719159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:17.9927323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:17.9927805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:17.9932085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:17.9932546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:18.2314024Z dist init r=1, world=2 2022-09-27T16:12:18.2318118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:12:18.2398349Z dist init r=0, world=2 2022-09-27T16:12:18.2403641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:12:18.2404858Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:18.2420785Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:19.6264116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:12:19.6264658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:12:20.3194077Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:20.3194619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:20.3226904Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:12:20.3227792Z warnings.warn( 2022-09-27T16:12:20.3229179Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:12:20.3229939Z warnings.warn( 2022-09-27T16:12:20.8225943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:20.8226479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:20.8276452Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:20.8278027Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:21.3254544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:21.3255056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:21.8285075Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:21.8285625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:21.8312556Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) (last message repeated 9 more times through 2022-09-27T16:12:21.8387491Z) 2022-09-27T16:12:22.3316508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:22.3317029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:22.3361797Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:22.3364462Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:22.8342058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:22.8342586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:23.3375596Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:23.3376180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:23.5867820Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) (last message repeated 9 more times through 2022-09-27T16:12:23.5879858Z) 2022-09-27T16:12:24.1478471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:12:24.1479208Z warnings.warn(msg, FutureWarning) 2022-09-27T16:12:24.1480956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:12:24.1481609Z warnings.warn(msg, FutureWarning)
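The FutureWarning records above come from torch.testing._deprecated, and the migration they point to is mechanical. A minimal sketch of the substitution, assuming two result tensors being compared in a test (the tensor names are illustrative, not from this test suite):

import torch
from torch.testing import assert_close

actual = torch.tensor([1.0, 2.0, 3.0])
expected = torch.tensor([1.0, 2.0, 3.0])

# Deprecated since 1.12; emits the FutureWarning seen in this log:
# torch.testing.assert_allclose(actual, expected)

# Replacement per https://github.com/pytorch/pytorch/issues/61844.
# Note that assert_allclose treated NaNs as equal by default while
# assert_close does not, so pass equal_nan=True for closer parity.
assert_close(actual, expected, equal_nan=True)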
2022-09-27T16:12:24.4002786Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (last message repeated 11 more times across both ranks through 2022-09-27T16:12:29.7006809Z) 2022-09-27T16:12:29.9565326Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:593: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:12:29.9566191Z world_indices[ 2022-09-27T16:12:29.9567375Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:593: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:12:29.9568264Z world_indices[ 2022-09-27T16:12:30.7556772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (last message repeated 9 more times across both ranks through 2022-09-27T16:12:34.9715782Z)
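The fully_sharded_data_parallel.py:1414 UserWarning emitted during this test recommends constructing FSDP with a device_id so that flattening and sharding run on the GPU. A minimal sketch of that construction, assuming a process group is already initialized with one GPU per rank (the Linear module is a placeholder):

import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

rank = dist.get_rank()   # assumes dist.init_process_group(...) already ran
model = nn.Linear(8, 8)  # placeholder module, still on CPU at this point

# With device_id set, FSDP moves the module to cuda:<rank> itself, so the
# flatten/shard work happens on the GPU and sync_module_states=True (which
# needs GPU communication) is satisfied without moving the module by hand.
fsdp_model = FSDP(model, device_id=rank, sync_module_states=True)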
2022-09-27T16:12:36.2721675Z ok (19.935s) 2022-09-27T16:12:36.2727039Z test_delayed_optim_step_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-09-27T16:12:36.2741770Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35488 2022-09-27T16:12:36.2749130Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35489 2022-09-27T16:12:37.9479054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:37.9479591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:37.9482272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:37.9482792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:37.9811633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:37.9812128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:37.9816309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:37.9816799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:38.2129428Z dist init r=0, world=2 2022-09-27T16:12:38.2133757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:12:38.2237445Z dist init r=1, world=2 2022-09-27T16:12:38.2242470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:12:38.2243494Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:38.2338471Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:39.5966216Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:12:39.5966791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:12:40.2826693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:40.2827251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:40.2859251Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:12:40.2860714Z warnings.warn( 2022-09-27T16:12:40.2862388Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:12:40.2863136Z warnings.warn( 2022-09-27T16:12:40.7858633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:40.7859518Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:40.7904045Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:40.7906071Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:41.2881699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:41.2882720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:41.7912619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:41.7913399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:41.7936445Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) (last message repeated 9 more times through 2022-09-27T16:12:41.7948397Z) 2022-09-27T16:12:42.2935340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:42.2936340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:42.2981106Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:42.2983056Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:12:42.7967345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:42.7968149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:43.2989883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:43.2991107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:43.5483350Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) (last message repeated 9 more times through 2022-09-27T16:12:43.5494615Z) 2022-09-27T16:12:44.1082767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:12:44.1083608Z warnings.warn(msg, FutureWarning) 2022-09-27T16:12:44.1085204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:12:44.1085847Z warnings.warn(msg, FutureWarning)
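The python_variable.cpp:326 warnings repeated above describe a C++ Tensor being deallocated while its Python object is still reachable through a weak reference. The pattern the message alludes to looks roughly like the sketch below; _fix_weakref() is an internal torch.Tensor helper, and this is only an illustration of the failure shape the warning describes, not of what the FSDP internals actually do:

import weakref
import torch

t = torch.ones(4)
ref = weakref.ref(t)   # take a weak reference to the tensor

resurrected = ref()    # dereference it; the tensor's PyObject is revived here
# Per the warning text, code that does this is expected to call
# _fix_weakref() afterwards so the refcount bookkeeping stays consistent
# when the underlying C++ Tensor is later freed.
resurrected._fix_weakref()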
2022-09-27T16:12:44.3603190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (last message repeated 11 more times across both ranks through 2022-09-27T16:12:49.6562286Z) 2022-09-27T16:12:49.9115291Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:593: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:12:49.9116127Z world_indices[ 2022-09-27T16:12:49.9117320Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:593: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:12:49.9118085Z world_indices[ 2022-09-27T16:12:50.7105809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (last message repeated 9 more times across both ranks through 2022-09-27T16:12:54.9304818Z)
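The test-name suffixes in this shard (offload_true/offload_false crossed with none, shard_grad_op, and no_shard) map onto FSDP's CPUOffload and ShardingStrategy knobs. A minimal sketch of spelling out one such configuration, assuming an initialized process group (the module is a placeholder):

import torch.nn as nn
from torch.distributed.fsdp import (
    CPUOffload,
    FullyShardedDataParallel as FSDP,
    ShardingStrategy,
)

model = nn.Linear(8, 8)  # placeholder

# "offload_true" variants park parameters on CPU between uses;
# SHARD_GRAD_OP keeps parameters gathered after forward and only
# reshards gradients and optimizer state (NO_SHARD would be DDP-like).
fsdp_model = FSDP(
    model,
    cpu_offload=CPUOffload(offload_params=True),
    sharding_strategy=ShardingStrategy.SHARD_GRAD_OP,
)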
2022-09-27T16:12:56.2072117Z ok (19.935s) 2022-09-27T16:12:56.2077596Z test_delayed_reduce_scatter_offload_false_no_shard (__main__.TestParityWithDDP) 2022-09-27T16:12:56.2091523Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35573 2022-09-27T16:12:56.2097848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35574 2022-09-27T16:12:57.9091735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:57.9092217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:57.9095010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:57.9096018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:57.9222210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:12:57.9222973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:12:57.9227258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:12:57.9228064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:12:58.1704326Z dist init r=0, world=2 2022-09-27T16:12:58.1708615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:12:58.1718666Z dist init r=1, world=2 2022-09-27T16:12:58.1724538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:12:58.1726296Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:58.1812410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:12:59.5417601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:12:59.5418425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:12:59.9971565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:12:59.9980137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:13:00.0005480Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:13:00.0006285Z warnings.warn( 2022-09-27T16:13:00.0017329Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:13:00.0018095Z warnings.warn( 2022-09-27T16:13:00.0375964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:13:00.0376659Z warnings.warn(msg, FutureWarning) 2022-09-27T16:13:00.0388068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:13:00.0388742Z warnings.warn(msg, FutureWarning) 2022-09-27T16:13:00.0442655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (last message repeated 19 more times across both ranks through 2022-09-27T16:13:00.4295610Z) 2022-09-27T16:13:00.4458184Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail.
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:13:00.4459149Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-09-27T16:13:00.4460368Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.) 2022-09-27T16:13:00.4461231Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-09-27T16:13:00.4740908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:13:00.4742296Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:13:01.0192467Z ok (4.812s) 2022-09-27T16:13:01.0197833Z test_delayed_reduce_scatter_offload_false_none (__main__.TestParityWithDDP) 2022-09-27T16:13:01.0201772Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82704 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-09-27T16:13:01.0206374Z test_delayed_reduce_scatter_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-09-27T16:13:01.0210223Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82398 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-09-27T16:13:01.0214573Z test_delayed_reduce_scatter_offload_true_no_shard (__main__.TestParityWithDDP) 2022-09-27T16:13:01.0228559Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35658 2022-09-27T16:13:01.0235555Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35659 2022-09-27T16:13:02.6612329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:13:02.6612835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:13:02.6615509Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:13:02.6615995Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:13:02.6925155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:13:02.6925623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:13:02.6929872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:13:02.6930355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:13:02.9234550Z dist init r=0, world=2 2022-09-27T16:13:02.9239301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:13:02.9401507Z dist init r=1, world=2 2022-09-27T16:13:02.9407416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:13:02.9408270Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:13:02.9443433Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:13:04.3141883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:13:04.3142429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:13:04.7467264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:13:04.7467812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:13:04.7499779Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:13:04.7500586Z warnings.warn( 2022-09-27T16:13:04.7501697Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:13:04.7502423Z warnings.warn( 2022-09-27T16:13:04.7601037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:13:04.7601539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
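The "dist init r=..., world=2" lines above are printed by each spawned worker, and the store_based_barrier_key INFO lines are torch.distributed's store-based barrier confirming that both ranks joined the group. A minimal sketch of the per-worker setup that produces this sequence, assuming NCCL and env-var rendezvous (the address and port values are illustrative):

import os
import torch.distributed as dist

def init_worker(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # illustrative rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    # Logs the "Added key: store_based_barrier_key:1 ..." and "Completed
    # store-based barrier ..." INFO lines once every rank has checked in.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    print(f"dist init r={rank}, world={world_size}")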
2022-09-27T16:13:04.7646434Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) (message repeated 2 times)
2022-09-27T16:13:04.7721166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (message repeated 4 times)
2022-09-27T16:13:04.7862529Z [W python_variable.cpp:326] (same decref warning; repeated 10 times through 16:13:04.7873957Z)
2022-09-27T16:13:04.7958736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (message repeated 2 times)
2022-09-27T16:13:04.8001882Z [W python_variable.cpp:326] (same decref warning; repeated 2 times)
2022-09-27T16:13:04.8077785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (message repeated 4 times)
2022-09-27T16:13:04.8220299Z [W python_variable.cpp:326] (same decref warning; repeated 10 times through 16:13:04.8232006Z)
2022-09-27T16:13:04.8729710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:04.8730409Z warnings.warn(msg, FutureWarning)
(FutureWarning emitted once per rank; repeated 2 times)
2022-09-27T16:13:04.8786952Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (message repeated 4 times through 16:13:04.9357186Z)
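Note: the recurring decref warning above names torch.Tensor._fix_weakref() as the remedy. A minimal sketch of the pattern the warning text describes, assuming a plain CPU tensor; this only illustrates the call the warning asks for, not an exact reproduction of the internal deallocation path:

    import weakref
    import torch

    t = torch.ones(3)
    wr = weakref.ref(t)   # take out a weak reference to the Tensor
    t2 = wr()             # dereference it while t is still alive
    if t2 is not None:
        # per the warning text, call _fix_weakref() after dereferencing so the
        # C++ Tensor and its PyObject stay consistent when deallocated
        t2._fix_weakref()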
2022-09-27T16:13:04.9924863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (message repeated 8 times through 16:13:05.1637147Z)
2022-09-27T16:13:05.1765890Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:951: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:13:05.1766832Z subtensor.view(shape)
(UserWarning emitted once per rank; repeated 2 times)
2022-09-27T16:13:05.2222773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. (message repeated 10 times through 16:13:05.4480292Z)
2022-09-27T16:13:06.0341007Z ok (5.013s)
2022-09-27T16:13:06.0345603Z test_delayed_reduce_scatter_offload_true_none (__main__.TestParityWithDDP)
2022-09-27T16:13:06.0349395Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82399 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s)
2022-09-27T16:13:06.0354191Z test_delayed_reduce_scatter_offload_true_shard_grad_op (__main__.TestParityWithDDP)
2022-09-27T16:13:06.0357152Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82403 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s)
2022-09-27T16:13:06.0375417Z test_mixture_of_experts_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35743
2022-09-27T16:13:06.0381741Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35744
2022-09-27T16:13:07.6974735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:07.6975250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:07.6977765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:07.6978263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
(both UserWarnings emitted once per rank; repeated 2 times)
2022-09-27T16:13:07.9570335Z dist init r=1, world=2
2022-09-27T16:13:07.9571567Z dist init r=0, world=2
2022-09-27T16:13:07.9575402Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:13:07.9577472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:13:07.9578630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:07.9678392Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:09.3343779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:09.3344314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:13:09.7819472Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
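Note: the fully_sharded_data_parallel.py:1414 UserWarning above recommends passing `device_id` so FSDP moves a CPU-resident module to its GPU before flattening and sharding. A minimal sketch, assuming a NCCL process group is already initialized and each rank has selected its own GPU; the Linear module stands in for the test's real model:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # assumes torch.distributed.init_process_group("nccl", ...) has run and
    # each rank called torch.cuda.set_device(rank)
    module = torch.nn.Linear(8, 8)  # starts on CPU, as in the warning
    fsdp_module = FSDP(
        module,
        device_id=torch.cuda.current_device(),  # lets FSDP move it to GPU first
        sync_module_states=True,                # requires GPU communication
    )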
2022-09-27T16:13:09.7820586Z warnings.warn(
2022-09-27T16:13:09.7851516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:13:09.7911448Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:09.7912223Z warnings.warn(
2022-09-27T16:13:09.7945959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:13:09.7946808Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:09.7954241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:09.8611816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:09.8612528Z warnings.warn(msg, FutureWarning)
(FutureWarning emitted once per rank; repeated 2 times)
2022-09-27T16:13:09.8715738Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:13:09.8718514Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:13:09.8719985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:09.8745314Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) (message repeated 3 times)
2022-09-27T16:13:09.8818841Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:09.8843111Z [W python_variable.cpp:326] (same decref warning; repeated 3 times)
2022-09-27T16:13:09.9565020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:09.9565987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:09.9566927Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:09.9592636Z [W python_variable.cpp:326] (same decref warning; repeated 3 times)
2022-09-27T16:13:09.9665692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:09.9690505Z [W python_variable.cpp:326] (same decref warning; repeated 3 times)
(the same sequence — both ranks add the key, each rank completes the store-based barrier, and the decref warning fires 3 times per rank — recurs for store_based_barrier_key:5 through store_based_barrier_key:10, 16:13:10.0414775Z through 16:13:10.5166353Z)
2022-09-27T16:13:10.5895977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:10.5896777Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:10.5897306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:10.5897964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:10.5922698Z [W python_variable.cpp:326] (same decref warning; repeated 6 times)
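Note: the store_based_barrier_key lines come from c10d's store-based barrier, run while each new process group is created: every rank increments a shared counter in the rendezvous store, then polls until the count reaches world_size. A minimal sketch of that idea against a torch.distributed.TCPStore; the helper name, timeout, and polling interval are illustrative, not PyTorch's actual implementation:

    import datetime
    import time
    import torch.distributed as dist

    def store_based_barrier(store, key, world_size,
                            timeout=datetime.timedelta(seconds=300)):
        # each rank announces arrival ("Added key: ... to store for rank: r")
        store.add(key, 1)
        deadline = datetime.datetime.now() + timeout
        # poll until every rank has checked in; add(key, 0) reads the counter
        while store.add(key, 0) < world_size:
            if datetime.datetime.now() > deadline:
                raise RuntimeError(f"timed out waiting on {key}")
            time.sleep(0.01)
        # corresponds to "Rank r: Completed store-based barrier for key:<key>
        # with <world_size> nodes."

    # usage sketch: rank 0 hosts the store, all ranks connect to it
    # store = dist.TCPStore("127.0.0.1", 29500, world_size, is_master=(rank == 0))
    # store_based_barrier(store, "store_based_barrier_key:1", world_size)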
2022-09-27T16:13:10.6801473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:10.6802649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:10.6803563Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:10.6841807Z [W python_variable.cpp:326] (same decref warning; repeated 26 times through 16:13:10.6875964Z)
2022-09-27T16:13:10.6902620Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:10.6938422Z [W python_variable.cpp:326] (same decref warning; repeated 26 times through 16:13:10.6972477Z)
2022-09-27T16:13:10.7686334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:13:10.7687226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:13:10.7688555Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:10.7716355Z [W python_variable.cpp:326] (same decref warning; repeated 3 times)
2022-09-27T16:13:10.7787426Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:10.7812914Z [W python_variable.cpp:326] (same decref warning; repeated 3 times)
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:10.7815377Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:11.4488212Z ok (5.413s) 2022-09-27T16:13:11.4507951Z test_mixture_of_experts_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35972 2022-09-27T16:13:11.4514487Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35973 2022-09-27T16:13:13.0552461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:13:13.0552980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:13:13.0555572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:13:13.0556072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:13:13.0648265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:13:13.0648712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:13:13.0652680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:13:13.0653160Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:13:13.3044183Z dist init r=1, world=2 2022-09-27T16:13:13.3047864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:13:13.3207880Z dist init r=0, world=2 2022-09-27T16:13:13.3213405Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:13:13.3214225Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:13:13.3251426Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:13:14.6921459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:13:14.6922465Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:13:15.1445071Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:13:15.1445910Z warnings.warn( 2022-09-27T16:13:15.1477478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-09-27T16:13:15.1500319Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
2022-09-27T16:13:15.1445071Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:15.1445910Z warnings.warn(
2022-09-27T16:13:15.1477478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:13:15.1500319Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:15.1501092Z warnings.warn(
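Both ranks hit the FSDP warning above because the module being wrapped still lives on CPU. The remedy the warning itself recommends is to pass `device_id`; a hedged sketch (the Linear module is a stand-in, not the test's mixture-of-experts model, and an initialized process group is assumed):

    # Hedged sketch of the fix the FSDP warning suggests; assumes
    # dist.init_process_group(...) has already run on each rank.
    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(rank: int) -> FSDP:
        model = nn.Linear(8, 8)  # still on CPU at this point
        return FSDP(
            model,
            device_id=torch.device("cuda", rank),  # FSDP moves/shards on GPU
            sync_module_states=True,  # needs GPU communication, per the warning
        )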
2022-09-27T16:13:15.1535789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:13:15.1536485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:15.1580145Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:15.2328475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:15.2329210Z warnings.warn(msg, FutureWarning)
2022-09-27T16:13:15.2330144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:15.2330784Z warnings.warn(msg, FutureWarning)
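The FutureWarning above carries its own migration instructions; the one-line change it asks for, sketched with toy tensors:

    # The migration the FutureWarning requests: assert_allclose -> assert_close.
    import torch
    from torch.testing import assert_close

    actual = torch.tensor([1.0, 2.0])
    expected = torch.tensor([1.0, 2.0])
    # Deprecated since 1.12: torch.testing.assert_allclose(actual, expected)
    assert_close(actual, expected)  # default rtol/atol are picked per dtype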
2022-09-27T16:13:15.2426952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:13:15.2430146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:13:15.2431402Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.2529685Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.3361711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:15.3362675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:15.3363807Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.3462434Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.4299501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:13:15.4301803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:13:15.4302793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.4400730Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.5427436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:13:15.5428847Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:13:15.5430252Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.5528858Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.6365898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:13:15.6366849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:13:15.6367715Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.6466826Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.7348645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:15.7349946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:15.7351682Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.7449772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.8326435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:15.8327722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:15.8329754Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.8428458Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.9310874Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:15.9312212Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:15.9313701Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:15.9411865Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:16.0373970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:16.0477627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:16.0478388Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:16.0577302Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:16.1420755Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:16.1421524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:16.1422762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
[decref warning x18]
2022-09-27T16:13:16.1522167Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
[decref warning x18]
2022-09-27T16:13:16.2410342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:13:16.2411631Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:13:16.2412469Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:16.2511076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
[decref warning x3]
2022-09-27T16:13:16.9621275Z ok (5.513s)
2022-09-27T16:13:16.9641486Z test_mixture_of_experts_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36201
2022-09-27T16:13:16.9648422Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36202
2022-09-27T16:13:18.5923598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:18.5924146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:18.5926455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:18.5926943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:13:18.6082010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:18.6082475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:18.6086680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:18.6087166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:13:18.8572235Z dist init r=0, world=2
2022-09-27T16:13:18.8575608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:13:18.8629390Z dist init r=1, world=2
2022-09-27T16:13:18.8635170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:13:18.8636045Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:18.8678908Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:20.2304057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:20.2304579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
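The "Started process N", "dist init", and key:1 barrier lines above are the harness's per-test preamble: spawn one worker per rank, then rendezvous. A generic sketch of the spawn half (hedged; torch.testing._internal.common_distributed adds pipes, timeouts, and the event-listener threads logged above):

    # Generic sketch of the pattern behind "Started process N with pid ...".
    import torch.multiprocessing as mp

    def worker(rank: int, world_size: int) -> None:
        print(f"dist init r={rank}, world={world_size}")  # mirrors the log
        # ... init_process_group and the actual test body would run here ...

    if __name__ == "__main__":
        world_size = 2
        mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)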
2022-09-27T16:13:20.7403278Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T16:13:20.8141209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:13:20.8141936Z warnings.warn(msg, FutureWarning) 2022-09-27T16:13:20.8142857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:13:20.8143824Z warnings.warn(msg, FutureWarning) 2022-09-27T16:13:20.8236695Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-09-27T16:13:20.8237197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-09-27T16:13:20.8237873Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T16:13:20.8238555Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T16:13:20.8259934Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.8261204Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.8262436Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.8263664Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.8264870Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.8266089Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T16:13:20.9057227Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-09-27T16:13:20.9057769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-09-27T16:13:20.9058522Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-09-27T16:13:20.9059402Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-09-27T16:13:20.9081768Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9083019Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9084390Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9085612Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9086796Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9088033Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9882996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-09-27T16:13:20.9883539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-09-27T16:13:20.9884269Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-09-27T16:13:20.9884942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-09-27T16:13:20.9907820Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T16:13:20.9909096Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9910312Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9911909Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9913323Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:20.9914559Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.0980456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-09-27T16:13:21.0981172Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-09-27T16:13:21.0981940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-09-27T16:13:21.0982614Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-09-27T16:13:21.1006351Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1007606Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1008831Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1010053Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1011260Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1012480Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1802656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-09-27T16:13:21.1803187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-09-27T16:13:21.1803928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-09-27T16:13:21.1804617Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-09-27T16:13:21.1828245Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1829528Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1831040Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1832310Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1833640Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:21.1834839Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
2022-09-27T16:13:21.2631468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:21.2632031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:21.2632803Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:21.2633487Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:21.2657069Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:21.2663420Z (last message repeated 5 times)
2022-09-27T16:13:21.3456929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:21.3457458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:21.3458213Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:21.3458899Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:21.3482201Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:21.3488303Z (last message repeated 5 times)
2022-09-27T16:13:21.4427507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:21.4428074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:21.4428818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:21.4429502Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:21.4453642Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:21.4460082Z (last message repeated 5 times)
2022-09-27T16:13:21.5246361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:21.5246937Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:21.5247694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:21.5248397Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:21.5271161Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:21.5277308Z (last message repeated 5 times)
2022-09-27T16:13:21.6070122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:21.6070908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:21.6071678Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:21.6072364Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:21.6105933Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:21.6148978Z (last message repeated 35 times)
2022-09-27T16:13:21.6953844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:13:21.6954378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:13:21.6955139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:21.6956029Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:21.6982063Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:21.6988692Z (last message repeated 5 times)
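The python_variable.cpp:326 warning that dominates this stretch of the log describes a specific lifecycle hazard: Python code took a weakref to a Tensor, got the PyObject back by dereferencing it, and never called the private Tensor._fix_weakref() hook, so the C++ side later deallocates the tensor while PyObject references are still live. A minimal sketch of the pattern the warning text describes; the tensor and variable names are illustrative, not taken from the test:

    import weakref

    import torch

    t = torch.ones(3)
    wr = weakref.ref(t)

    # Dereference the weak reference to get the tensor's PyObject back.
    resurrected = wr()

    # Per the warning text, PyTorch expects _fix_weakref() to be called after
    # dereferencing so that the C++ TensorImpl and the PyObject agree on
    # ownership again; skipping it is what later triggers the
    # "(function decref)" warnings above.
    resurrected._fix_weakref()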
2022-09-27T16:13:22.3752274Z ok (5.413s)
2022-09-27T16:13:22.3771672Z test_mixture_of_experts_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36430
2022-09-27T16:13:22.3778532Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36431
2022-09-27T16:13:24.0445230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:24.0445748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:24.0448634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:24.0449122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:13:24.0641633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:24.0642335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:24.0647005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:24.0647543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:13:24.3130636Z dist init r=0, world=2
2022-09-27T16:13:24.3134457Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:13:24.3196586Z dist init r=1, world=2
2022-09-27T16:13:24.3202037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:13:24.3203112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:24.3237104Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
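The "Started process <rank> with pid ..." and "dist init r=<rank>, world=2" lines come from the common_distributed harness launching one worker process per rank, each of which initializes a fresh process group (hence store_based_barrier_key restarting at 1 for the new test). A rough sketch of that launch pattern; the worker body, gloo backend, and rendezvous settings are placeholder assumptions, not the harness's actual code:

    import os

    import torch.distributed as dist
    import torch.multiprocessing as mp


    def worker(rank: int, world_size: int) -> None:
        # Rendezvous for the default process group; init_process_group ends
        # with a store-based barrier, which is what prints the
        # store_based_barrier_key:1 lines above.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        print(f"dist init r={rank}, world={world_size}")
        dist.destroy_process_group()


    if __name__ == "__main__":
        # Spawns world_size child processes, one per rank, much like the
        # "Started process 0 with pid ..." lines in this log.
        mp.spawn(worker, args=(2,), nprocs=2)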
2022-09-27T16:13:25.6754221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:13:25.6754848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:26.1045761Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:26.1046586Z warnings.warn(
2022-09-27T16:13:26.1077353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:13:26.1140047Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:26.1140823Z warnings.warn(
2022-09-27T16:13:26.1175314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:13:26.1176298Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:26.1180206Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:26.1291431Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.1297869Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.1303599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:13:26.1311121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:13:26.1312104Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:26.1406140Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:26.1517733Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.1524384Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
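The FSDP UserWarning above states its own remedy: construct the wrapper with device_id so FSDP moves the module to the GPU before flattening and sharding, which also makes sync_module_states=True viable. A minimal sketch assuming a process group is already initialized; the Linear module is a placeholder, not the test's model:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Module built on CPU, the situation the warning describes.
    module = nn.Linear(8, 8)

    # device_id tells FSDP to move the module onto this GPU up front, so
    # flattening/sharding run on the GPU instead of the slower CPU path.
    fsdp_module = FSDP(module, device_id=torch.cuda.current_device())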
2022-09-27T16:13:26.1527197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:26.1534624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:26.1535478Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:26.1629991Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:26.1745144Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.1750548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:13:26.1751799Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.1758507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:13:26.1759188Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:26.1853329Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:26.1972173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:13:26.1979056Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:13:26.1979931Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:26.1983058Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.2075091Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:26.2076822Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.2195383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:13:26.2202144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:13:26.2202964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:26.2211065Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.2297843Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:26.2304979Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.2426993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:26.2435696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:26.2436375Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:26.2447224Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.2529659Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:26.2539598Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.3357007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:26.3357733Z warnings.warn(msg, FutureWarning)
2022-09-27T16:13:26.3361742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:26.3362411Z warnings.warn(msg, FutureWarning)
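The FutureWarning above carries its own migration path: torch.testing.assert_allclose() is deprecated since 1.12 and slated for removal in 1.14, with torch.testing.assert_close() as the replacement. A small before/after sketch; the tensors are invented for illustration:

    import torch

    actual = torch.tensor([1.0, 2.0, 3.0])
    expected = torch.tensor([1.0, 2.0, 3.0])

    # Deprecated since 1.12 (emits the FutureWarning above):
    # torch.testing.assert_allclose(actual, expected)

    # Replacement; see https://github.com/pytorch/pytorch/issues/61844 for
    # detailed upgrade instructions, including default-tolerance differences.
    torch.testing.assert_close(actual, expected)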
2022-09-27T16:13:26.3463571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:26.3470392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:26.3471834Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:26.3499393Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.3502050Z (last message repeated 2 times)
2022-09-27T16:13:26.3566341Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:26.3591373Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.3594177Z (last message repeated 2 times)
2022-09-27T16:13:26.4486101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:26.4491895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:26.4492679Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:26.4519844Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.4522346Z (last message repeated 2 times)
2022-09-27T16:13:26.4586767Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:26.4611486Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.4613939Z (last message repeated 2 times)
2022-09-27T16:13:26.5501707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:26.5508141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:26.5509139Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:26.5538889Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.5541523Z (last message repeated 2 times)
2022-09-27T16:13:26.5602812Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:26.5629840Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.5632741Z (last message repeated 2 times)
2022-09-27T16:13:26.6840176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:26.6846046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:26.6847000Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:26.6889689Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.6943386Z (last message repeated 41 times)
2022-09-27T16:13:26.6944373Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:26.6981434Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:26.7008202Z (last message repeated 16 times)
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7009392Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7010596Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7011795Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7012982Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7014249Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7015450Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7016641Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7017891Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7019074Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7020266Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7021458Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7022654Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7023893Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7025102Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7026288Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7027481Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7028704Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7029918Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7031468Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:26.7032783Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
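[Note: the decref warning collapsed above is emitted from torch/csrc/autograd/python_variable.cpp and names the private torch.Tensor._fix_weakref() API. A minimal sketch of the sequence the warning text describes, using a toy tensor of our own (an illustration, not the test's code):

    import weakref
    import torch

    t = torch.ones(3)
    w = weakref.ref(t)   # take out a weak reference to the Tensor
    _ = w()              # dereference it, resurrecting the Python wrapper
    t._fix_weakref()     # private API the warning names; re-syncs the
                         # PyObject's refcount with the C++ TensorImpl
    del t                # deallocation can now drop the PyObject cleanly
]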
2022-09-27T16:13:26.8020500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:13:26.8023762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:13:26.8025296Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:26.8055566Z [W python_variable.cpp:326] (decref warning x3, through 16:13:26.8058338Z)
2022-09-27T16:13:26.8120485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:26.8147925Z [W python_variable.cpp:326] (decref warning x3, through 16:13:26.8150545Z)
[... the same block repeats for store_based_barrier_key:14 through store_based_barrier_key:19 (key added for ranks 0 and 1, Rank 1 completes, 3 decref warnings, Rank 0 completes, 3 decref warnings), 2022-09-27T16:13:26.9026163Z through 2022-09-27T16:13:27.4178129Z ...]
2022-09-27T16:13:28.0891131Z ok (5.714s)
2022-09-27T16:13:28.0911268Z test_mixture_of_experts_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ...
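[Note: the "Added key ... / Completed store-based barrier" pairs above come from torch.distributed.distributed_c10d during process-group setup. A hedged sketch of the pattern behind these messages; the helper name store_based_barrier and the TCPStore setup are ours, not the test's:

    import time
    import torch.distributed as dist

    def store_based_barrier(store, rank, world_size, key):
        store.add(key, 1)                      # logs "Added key: ... to store for rank: r"
        while store.add(key, 0) < world_size:  # add(key, 0) just reads the counter
            time.sleep(0.01)
        # logs "Rank r: Completed store-based barrier for key:... with N nodes."

    # Each of the 2 ranks would run, e.g.:
    #   store = dist.TCPStore("127.0.0.1", 29500, 2, is_master=(rank == 0))
    #   store_based_barrier(store, rank, 2, "store_based_barrier_key:1")
]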
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36683
2022-09-27T16:13:28.0917343Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36684
2022-09-27T16:13:29.7582569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:29.7583066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:29.7586325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:29.7586788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:13:29.7707168Z (same two common_utils.py UserWarnings repeated, through 16:13:29.7712045Z)
2022-09-27T16:13:30.0194905Z dist init r=1, world=2
2022-09-27T16:13:30.0212792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:13:30.0242114Z dist init r=0, world=2
2022-09-27T16:13:30.0248042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:13:30.0249058Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:30.0316615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:31.3948170Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:31.3948724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:13:31.8246165Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:31.8246962Z warnings.warn(
2022-09-27T16:13:31.8277989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:13:31.8337839Z (same fully_sharded_data_parallel.py:1414 UserWarning, through 16:13:31.8338594Z)
2022-09-27T16:13:31.8373304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:13:31.8374013Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
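[Note: the fully_sharded_data_parallel.py:1414 UserWarning above recommends passing `device_id` so FSDP moves the module to GPU before flattening and sharding. A minimal sketch of that recommendation, with a toy module of our own; it assumes dist.init_process_group has already run, as in these tests:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(8, 8)  # toy module, deliberately left on CPU
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # FSDP moves the module to GPU first
        sync_module_states=True,                # needs GPU communication, hence device_id
    )
]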
2022-09-27T16:13:31.8380423Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:31.8494214Z [W python_variable.cpp:326] (decref warning x2, through 16:13:31.8500804Z)
2022-09-27T16:13:31.8507576Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:13:31.8515008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:13:31.8515692Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:31.8610496Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:31.8727144Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:372: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:13:31.8728101Z p.detach().reshape(-1) if isinstance(p, nn.Parameter) else p.reshape(-1)
2022-09-27T16:13:31.8733902Z (same flat_param.py:372 UserWarning and source line, through 16:13:31.8734863Z)
2022-09-27T16:13:31.8737947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:31.8745262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:31.8746421Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:31.8841102Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:31.8958906Z [W python_variable.cpp:326] (decref warning)
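[Note: the flat_param.py:372 UserWarning above points at the exact expression quoted in the log, where FSDP flattens each parameter before concatenating them into its single flat parameter. A stripped-down sketch with toy parameters of our own (the surrounding FlatParameter machinery is omitted):

    import torch
    import torch.nn as nn

    params = [nn.Parameter(torch.randn(4, 4)), nn.Parameter(torch.randn(8))]
    # Flatten each (detached) parameter to 1-D and concatenate, as the quoted line does.
    flat = torch.cat([
        p.detach().reshape(-1) if isinstance(p, nn.Parameter) else p.reshape(-1)
        for p in params
    ])
    assert flat.numel() == sum(p.numel() for p in params)  # 16 + 8 = 24
]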
2022-09-27T16:13:31.8964824Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:13:31.8966235Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:31.8971972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:13:31.8973039Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:31.9067571Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:31.9188341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:13:31.9196059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:13:31.9197343Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:31.9200648Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:31.9291432Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:31.9292987Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:31.9413778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:13:31.9421284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:13:31.9422406Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:31.9431026Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:31.9516264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:31.9523944Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:31.9648018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:31.9656541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:31.9657430Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:31.9668088Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:31.9750480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:31.9761105Z [W python_variable.cpp:326] (decref warning)
2022-09-27T16:13:32.0674844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:32.0675529Z warnings.warn(msg, FutureWarning)
2022-09-27T16:13:32.0677193Z (same torch/testing/_deprecated.py:35 FutureWarning, through 16:13:32.0678087Z)
2022-09-27T16:13:32.0780261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:32.0786886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:32.0787729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:32.0817309Z [W python_variable.cpp:326] (decref warning x3, through 16:13:32.0819768Z)
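[Note: the FutureWarning above asks callers to replace torch.testing.assert_allclose with torch.testing.assert_close. The migration it names, on toy tensors of our own:

    import torch

    actual = torch.tensor([1.0, 2.0])
    expected = torch.tensor([1.0, 2.0])

    # torch.testing.assert_allclose(actual, expected)  # deprecated since 1.12
    torch.testing.assert_close(actual, expected)       # replacement named by the warning
]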
2022-09-27T16:13:32.0882798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:32.0909151Z [W python_variable.cpp:326] (decref warning x3, through 16:13:32.0911906Z)
2022-09-27T16:13:32.1954244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:32.2016098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:32.2017136Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:32.2046355Z [W python_variable.cpp:326] (decref warning x3, through 16:13:32.2048864Z)
2022-09-27T16:13:32.2055881Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:32.2082722Z [W python_variable.cpp:326] (decref warning x3, through 16:13:32.2085191Z)
2022-09-27T16:13:32.3034032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:32.3040312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:32.3041105Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:32.3071157Z [W python_variable.cpp:326] (decref warning x3, through 16:13:32.3073683Z)
2022-09-27T16:13:32.3134948Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:32.3162199Z [W python_variable.cpp:326] (decref warning x3, through 16:13:32.3164914Z)
2022-09-27T16:13:32.4353984Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:32.4360586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:32.4361602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:32.4404752Z [W python_variable.cpp:326] (decref warning repeated 31 times, through 16:13:32.4443677Z)
2022-09-27T16:13:32.4444868Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references.
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4446070Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4447319Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4448535Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4449727Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4450983Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4452161Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4453371Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4454583Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4455787Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4456989Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4457884Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-09-27T16:13:32.4494177Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4495716Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4497469Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4498968Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4500180Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4501569Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4503080Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4504289Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4505495Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4506688Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4507890Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4509096Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4510289Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4511697Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4512973Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4514178Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4515382Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4516578Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4517894Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4519079Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T16:13:32.4520284Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4521462Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4522657Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4523901Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4525101Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4526294Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4527536Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4528761Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4529951Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4531155Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4532381Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4533584Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4534778Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4535967Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4537159Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4538348Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4539551Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4540743Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4541989Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4543194Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4544397Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.4545598Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.5455871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-09-27T16:13:32.5461482Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-09-27T16:13:32.5462618Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-09-27T16:13:32.5495582Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.5496874Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.5498113Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.5557065Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-09-27T16:13:32.5586344Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.5587578Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.5588804Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:32.6522152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-09-27T16:13:32.6526433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-09-27T16:13:32.6527367Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
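The decref warning above comes from torch/csrc/autograd/python_variable.cpp and names torch.Tensor._fix_weakref() as the missing call. A minimal, hypothetical sketch of the pattern the message describes, assuming a plain CPU tensor stands in for whatever the test deallocates:

import weakref

import torch

t = torch.ones(3)    # placeholder tensor; any Tensor would do
r = weakref.ref(t)   # take out a weak reference to the Tensor
_ = r()              # dereference it, as the warning describes
t._fix_weakref()     # the call the warning says must follow dereferencing
del t                # without _fix_weakref(), this deallocation is the
                     # point where the decref warning would fire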
[decref warning ×3: 2022-09-27T16:13:32.6553921Z–2022-09-27T16:13:32.6556424Z]
2022-09-27T16:13:32.6623429Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.6648167Z–2022-09-27T16:13:32.6650634Z]
2022-09-27T16:13:32.7585017Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0
2022-09-27T16:13:32.7589267Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1
2022-09-27T16:13:32.7590264Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.7617094Z–2022-09-27T16:13:32.7619619Z]
2022-09-27T16:13:32.7685852Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.7710464Z–2022-09-27T16:13:32.7713325Z]
2022-09-27T16:13:32.8646520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0
2022-09-27T16:13:32.8651684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1
2022-09-27T16:13:32.8652552Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.8679450Z–2022-09-27T16:13:32.8681926Z]
2022-09-27T16:13:32.8747524Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.8772617Z–2022-09-27T16:13:32.8775090Z]
2022-09-27T16:13:32.9701602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0
2022-09-27T16:13:32.9705659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1
2022-09-27T16:13:32.9706916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.9733801Z–2022-09-27T16:13:32.9736678Z]
2022-09-27T16:13:32.9802235Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:32.9826706Z–2022-09-27T16:13:32.9829189Z]
2022-09-27T16:13:33.0753978Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0
2022-09-27T16:13:33.0758692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1
2022-09-27T16:13:33.0759669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:33.0784380Z–2022-09-27T16:13:33.0786881Z]
2022-09-27T16:13:33.0854807Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:33.0877769Z–2022-09-27T16:13:33.0880439Z]
2022-09-27T16:13:33.1806448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0
2022-09-27T16:13:33.1810697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1
2022-09-27T16:13:33.1811497Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:33.1836739Z–2022-09-27T16:13:33.1839406Z]
2022-09-27T16:13:33.1907326Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
[decref warning ×3: 2022-09-27T16:13:33.1930318Z–2022-09-27T16:13:33.1932773Z]
2022-09-27T16:13:33.9035365Z ok (5.814s)
2022-09-27T16:13:33.9055655Z test_mixture_of_experts_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36936
2022-09-27T16:13:33.9062462Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36937
2022-09-27T16:13:35.5972656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:35.5973163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:35.5976325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:35.5976795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[same two UserWarnings from the second process: 2022-09-27T16:13:35.6289280Z–2022-09-27T16:13:35.6295055Z]
2022-09-27T16:13:35.8653526Z dist init r=1, world=2
2022-09-27T16:13:35.8658154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:13:35.8752711Z dist init r=0, world=2
2022-09-27T16:13:35.8758258Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:13:35.8759468Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:35.8760748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
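The "Started process ... with pid" and "dist init r=..., world=2" lines are the two test workers coming up. A rough sketch of how a 2-process run like this can be launched, assuming a gloo backend and localhost rendezvous (the real harness in torch.testing._internal.common_distributed is more involved):

import os

import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    print(f"dist init r={rank}, world={world_size}")   # mirrors the log line
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)  # one subprocess per rank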
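Right after init, each rank runs c10d's store-based barrier, which is what the store_based_barrier_key INFO lines trace. A hedged sketch of the underlying idea, written against a TCPStore counter rather than the private torch.distributed helper:

import time

import torch.distributed as dist

def store_based_barrier(store, world_size, key="store_based_barrier_key:1"):
    store.add(key, 1)                      # this rank checks in under the key
    while store.add(key, 0) < world_size:  # add(key, 0) just reads the counter
        time.sleep(0.01)                   # poll until every rank checked in

# Hypothetical usage; host, port, and rank wiring are placeholders:
# store = dist.TCPStore("127.0.0.1", 29501, world_size=2, is_master=(rank == 0))
# store_based_barrier(store, world_size=2)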
2022-09-27T16:13:37.2486742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:37.2487263Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:13:37.7075901Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:37.7076708Z warnings.warn(
[same UserWarning from the other rank: 2022-09-27T16:13:37.7097282Z–2022-09-27T16:13:37.7098048Z]
2022-09-27T16:13:37.7110114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:13:37.7131515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:13:37.7132200Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:37.7212962Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
[decref warning ×2: 2022-09-27T16:13:37.7333691Z–2022-09-27T16:13:37.7334941Z]
2022-09-27T16:13:37.7346269Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:13:37.7347952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:13:37.7348672Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:37.7448580Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:37.7570871Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:372: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:13:37.7571867Z p.detach().reshape(-1) if isinstance(p, nn.Parameter) else p.reshape(-1)
[same UserWarning and source line from the other rank: 2022-09-27T16:13:37.7573106Z–2022-09-27T16:13:37.7574140Z]
2022-09-27T16:13:37.7581452Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:37.7581944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:37.7582610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:37.7583472Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
[decref warning ×2: 2022-09-27T16:13:37.7708913Z–2022-09-27T16:13:37.7710262Z]
2022-09-27T16:13:37.7714585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:13:37.7716029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:13:37.7716813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:37.7817071Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:37.7945247Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:13:37.7946035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:13:37.7946716Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:37.7947694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
[decref warning ×2: 2022-09-27T16:13:37.7949011Z–2022-09-27T16:13:37.7950580Z]
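The fully_sharded_data_parallel.py:1414 UserWarning above recommends passing `device_id` so flattening and sharding run on GPU. A minimal sketch of that recommendation, assuming the process group is already initialized and the module here is a placeholder:

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# assumes torch.distributed.init_process_group(...) has already run
model = nn.Linear(8, 8)                     # placeholder module, born on CPU
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),  # lets FSDP move it to this GPU
    sync_module_states=True,                # per the warning, needs GPU comms
)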
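For context on the flat_param.py:372 echo above: FSDP builds its flat parameter by reshaping each parameter to 1-D and concatenating, which is the line the warning was triggered from. A hedged sketch around that same expression; the helper name and module are illustrative:

import torch
import torch.nn as nn

def flatten_params(params):
    # mirrors the echoed source line from flat_param.py:372
    flats = [
        p.detach().reshape(-1) if isinstance(p, nn.Parameter) else p.reshape(-1)
        for p in params
    ]
    return torch.cat(flats)

m = nn.Linear(4, 2)                    # placeholder module
flat = flatten_params(m.parameters())  # 4*2 weights + 2 biases -> shape (10,)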
2022-09-27T16:13:37.8079650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:13:37.8080814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:13:37.8081646Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
[decref warning ×1: 2022-09-27T16:13:37.8089379Z]
2022-09-27T16:13:37.8181879Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
[decref warning ×1: 2022-09-27T16:13:37.8189626Z]
2022-09-27T16:13:37.8322161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:37.8322710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:37.8323378Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:37.8324042Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
[decref warning ×2: 2022-09-27T16:13:37.8333698Z–2022-09-27T16:13:37.8334928Z]
2022-09-27T16:13:37.9211751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:37.9212458Z warnings.warn(msg, FutureWarning)
[same FutureWarning from the other rank: 2022-09-27T16:13:37.9213406Z–2022-09-27T16:13:37.9214067Z]
2022-09-27T16:13:37.9321286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:37.9321791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:37.9322460Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:37.9323185Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
[decref warning ×6: 2022-09-27T16:13:37.9350933Z–2022-09-27T16:13:37.9357189Z]
2022-09-27T16:13:38.0298245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:38.0298783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:38.0299532Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:38.0300215Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
[decref warning ×6: 2022-09-27T16:13:38.0325864Z–2022-09-27T16:13:38.0332177Z]
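The FutureWarning above asks the test to migrate off torch.testing.assert_allclose(). A minimal sketch of the replacement it names; the tensors and tolerance gap are illustrative:

import torch
from torch.testing import assert_close

expected = torch.tensor([1.0, 2.0])
actual = expected + 1e-7            # tiny mismatch within default tolerances

# deprecated: torch.testing.assert_allclose(actual, expected)
assert_close(actual, expected)      # rtol/atol defaults are picked per dtype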
[trimmed: store-based barriers for keys 9 through 18 are added and completed for ranks 0 and 1 between 16:13:37.93 and 16:13:38.87, roughly one every 100 ms. Each barrier is followed by a burst of the elided decref warnings, including one long run of roughly 80 repetitions after key:12; about 130 duplicate warning lines are omitted in total.]
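For context on the steadily incrementing keys: in the PyTorch version this log comes from, each process-group creation in torch.distributed performs a store-based barrier and logs these INFO lines with the next key. A minimal sketch, assuming a launcher has set the usual env:// rendezvous variables (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE):

    import torch.distributed as dist

    # init_process_group() performs the first store-based barrier
    # ("store_based_barrier_key:1" in logs like the above) ...
    dist.init_process_group(backend="nccl")

    # ... and each new_group() performs another, incrementing the key,
    # which is why a test that builds many groups logs key:2, key:3, ...
    pg = dist.new_group(ranks=[0, 1])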
[trimmed: store_based_barrier_key:19 is added and completed for ranks 0 and 1 at 16:13:38.96, again followed by six elided decref warnings.]
2022-09-27T16:13:39.6175964Z ok (5.714s)
2022-09-27T16:13:39.6194322Z test_mixture_of_experts_with_delay_before_free_offload_false_no_shard (__main__.TestParityWithDDP) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37189
2022-09-27T16:13:39.6200294Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37190
2022-09-27T16:13:41.2452532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:41.2453038Z   warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:41.2455492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:41.2455998Z   warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[trimmed: both worker processes emit the two UserWarnings above; the second copies are omitted.]
2022-09-27T16:13:41.5091781Z dist init r=0, world=2
2022-09-27T16:13:41.5181587Z dist init r=1, world=2
[trimmed: store_based_barrier_key:1 is added and completed for ranks 0 and 1 at 16:13:41.52.]
2022-09-27T16:13:42.8782445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:42.8782983Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:13:43.3195596Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:43.3196434Z   warnings.warn(
[trimmed: both ranks emit the FSDP warning above; the second copy is omitted. store_based_barrier_key:2 is then added and completed for ranks 0 and 1 at 16:13:43.33.]
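The fix the FSDP warning suggests is to hand FSDP the target device up front. A minimal sketch with a toy module, assuming the process group is already initialized and one GPU per rank:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(8, 8)  # toy module, still on CPU at this point

    # Passing device_id lets FSDP move the module onto this rank's GPU
    # before flattening and sharding, avoiding the CPU-path warning above
    # and making sync_module_states=True workable.
    fsdp_model = FSDP(model, device_id=torch.cuda.current_device())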
2022-09-27T16:13:43.3330060Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-09-27T16:13:43.8171589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:13:43.8172295Z warnings.warn(msg, FutureWarning) 2022-09-27T16:13:43.8174006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:13:43.8174670Z warnings.warn(msg, FutureWarning) 2022-09-27T16:13:43.8270337Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-09-27T16:13:43.8274856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-09-27T16:13:43.8275587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T16:13:43.8301041Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:43.8302298Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:43.8303519Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:43.8373292Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-09-27T16:13:43.8397128Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:43.8398365Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:13:43.8399786Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
2022-09-27T16:13:44.2376281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:44.2377505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:44.2378292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:44.2404765Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:44.2476983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:44.2501385Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:44.6474242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:13:44.6477276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:13:44.6478072Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:44.6506097Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
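Each store_based_barrier_key:N pair of "Added key" / "Completed store-based barrier" lines marks one process-group creation: every rank writes its key into the shared store, then polls until all world_size ranks (here 2) have done the same. A rough sketch of the calls that drive such lines, assuming the key counter advances once per group creation as these logs suggest (in the real test, rank and addressing come from the harness):

    import os
    import torch.distributed as dist

    rank = int(os.environ["RANK"])  # set by the launcher/test harness

    # Produces the first barrier ("store_based_barrier_key:1" above).
    dist.init_process_group("nccl", rank=rank, world_size=2)

    # Each further group creation bumps the counter, yielding the
    # key:2, key:3, ... barriers interleaved through this log.
    group = dist.new_group(ranks=[0, 1])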
2022-09-27T16:13:44.6575525Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:44.6602014Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:45.0575364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:13:45.0578211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:13:45.0578991Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:45.0606786Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:45.0676106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:45.0702411Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:45.4675816Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:13:45.4678657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:13:45.4679639Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:45.4707727Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:45.4776982Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:45.4804185Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:45.8786266Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:45.8789945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:45.8790940Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:45.8818599Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:45.8887496Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:45.8913668Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:46.2898562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:46.2901854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:46.2902698Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:46.2931296Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:46.3000493Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:46.3026221Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:46.7010479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:46.7013632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:46.7014468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:46.7042986Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:46.7111840Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:46.7138245Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:47.1119485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:47.1123465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:47.1124239Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:47.1152025Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:47.1220500Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:47.1247135Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:47.5235109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:47.5237283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:47.5238476Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:47.5276758Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
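The repeated [W python_variable.cpp:326] line is a C++-side lifetime diagnostic, not a test failure. Its wording refers to the Python weak-reference pattern sketched below; this sketch only illustrates what the message describes and does not by itself reproduce the warning, and Tensor._fix_weakref() is an internal hook that user code normally never calls:

    import weakref
    import torch

    t = torch.ones(2, 2)
    w = weakref.ref(t)   # "took out a weak reference to Tensor"
    assert w() is t      # dereferencing the weak reference
    # t._fix_weakref()   # internal hook named by the warning; left commented out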
2022-09-27T16:13:47.5336534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:47.5371934Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:47.9365556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:13:47.9369122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:13:47.9369915Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:47.9397988Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:47.9466479Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:47.9493040Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:48.9368384Z ok (9.319s)
2022-09-27T16:13:48.9386543Z test_mixture_of_experts_with_delay_before_free_offload_false_none (__main__.TestParityWithDDP) ...
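Each TestParityWithDDP case follows the same lifecycle visible in the lines that follow: the harness starts one fresh process per rank ("Started process 0/1"), both ranks dist-init into a world of 2, the test body runs, and the processes tear down. A simplified sketch of that spawn pattern, with illustrative names and address rather than the harness's actual code:

    import torch.distributed as dist
    import torch.multiprocessing as mp

    def worker(rank: int, world_size: int) -> None:
        # Mirrors the "dist init r=<rank>, world=2" lines in this log.
        dist.init_process_group(
            "nccl", init_method="tcp://127.0.0.1:29500",
            rank=rank, world_size=world_size,
        )
        # ... test body runs on each rank here ...
        dist.destroy_process_group()

    if __name__ == "__main__":
        mp.spawn(worker, args=(2,), nprocs=2, join=True)  # one process per rank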
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37418
2022-09-27T16:13:48.9392699Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37419
2022-09-27T16:13:50.5868863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:13:50.5869859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:13:50.5871796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:13:50.5872310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:13:50.8395997Z dist init r=0, world=2
2022-09-27T16:13:50.8400433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:13:50.8703094Z dist init r=1, world=2
2022-09-27T16:13:50.8708497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:13:50.8709656Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:50.8807185Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:13:52.2383341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:13:52.2383871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:13:52.6634575Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:13:52.6635381Z warnings.warn(
2022-09-27T16:13:52.6666215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:13:52.6764232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:13:52.6765205Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:52.6768631Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:13:53.1668327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:13:53.1669071Z warnings.warn(msg, FutureWarning)
2022-09-27T16:13:53.1764076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:13:53.1769236Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:13:53.1770248Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:53.1796680Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:53.1866731Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:13:53.1891422Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
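The slow-test, disabled-test, and deprecation messages are ordinary Python warnings, so a local run can quiet them with the standard warnings machinery; a sketch with message patterns copied from this log (note the [W python_variable.cpp:...] lines come from the C++ warning handler and are not necessarily covered by these filters):

    import warnings

    # Test-infra bookkeeping warnings seen repeatedly in this log.
    warnings.filterwarnings("ignore", message=r"loaded \d+ slow tests")
    warnings.filterwarnings("ignore", message=r"loaded \d+ disabled tests")

    # The assert_allclose deprecation notice.
    warnings.filterwarnings(
        "ignore",
        message=r"torch\.testing\.assert_allclose\(\) is deprecated",
        category=FutureWarning,
    )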
2022-09-27T16:13:53.6240183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:13:53.6243365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:13:53.6244457Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:53.6270522Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:53.6341636Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:13:53.6366359Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:54.0718593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:13:54.0721239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:13:54.0722059Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:54.0749543Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:54.0819537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:13:54.0845522Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:54.5195378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:13:54.5199948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:13:54.5201026Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:54.5228490Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:54.5297048Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:13:54.5323097Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:54.9678095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:13:54.9679318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:13:54.9680516Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:54.9708189Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:54.9779043Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:13:54.9805134Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:13:55.4164712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:13:55.4170316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:13:55.4171544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:55.4266053Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:13:55.8649245Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:13:55.8652433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:13:55.8653632Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:55.8750814Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:13:56.3142350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:13:56.3146350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:13:56.3147167Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:56.3244050Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:13:56.7622392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:13:56.7625512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:13:56.7626402Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:56.7724343Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:13:57.2112548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:13:57.2114274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:13:57.2115566Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:57.2213976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:13:57.6619445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:13:57.6620366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:13:57.6621405Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
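Each "Added key: store_based_barrier_key:N" / "Completed store-based barrier" pair above is c10d's store-based rendezvous: every rank writes its key into the shared TCPStore and then polls until all world_size ranks have checked in. A minimal single-process sketch of the calls that emit these lines, assuming the PyTorch 1.12-era behavior where the key index increments with each process-group creation (which is why one test counts up from key:1 to key:13):

```python
import os

import torch.distributed as dist

# single-process illustration; MASTER_ADDR/PORT are arbitrary local values
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# init_process_group performs a store-based barrier: each rank adds
# "store_based_barrier_key:1" to the shared TCPStore, then polls the store
# until the counter reaches world_size -- producing the INFO lines above.
dist.init_process_group(backend="gloo", rank=0, world_size=1)

# every subsequent group creation bumps the key (key:2, key:3, ...)
group = dist.new_group(ranks=[0])

dist.destroy_process_group()
```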
2022-09-27T16:13:57.6720521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:13:58.6576448Z ok (9.721s)
2022-09-27T16:13:58.6594877Z test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37647
2022-09-27T16:13:58.6600729Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37648
2022-09-27T16:14:00.3871939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:14:00.3872475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:14:00.3875261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:14:00.3875748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:14:00.4024381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:14:00.4024853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:14:00.4029277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:14:00.4029769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:14:00.6507113Z dist init r=1, world=2
2022-09-27T16:14:00.6509310Z dist init r=0, world=2
2022-09-27T16:14:00.6511284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:14:00.6516065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:14:00.6516903Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:00.6614162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
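The "Started process N with pid ..." and "dist init r=N, world=2" lines show the harness launching one subprocess per rank before each test. A rough, self-contained sketch of that launch pattern follows; the real `torch.testing._internal.common_distributed` machinery additionally wires up event-listener threads, timeouts, and error propagation.

```python
import os

import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank: int, world_size: int) -> None:
    # each spawned process initializes its own end of the process group
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29501"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    print(f"dist init r={rank}, world={world_size}")  # mirrors the log lines
    dist.barrier()
    dist.destroy_process_group()


if __name__ == "__main__":
    # one process per rank, as in "Started process 0/1 with pid ..."
    mp.spawn(worker, args=(2,), nprocs=2, join=True)
```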
2022-09-27T16:14:02.0133701Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:14:02.0134219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:14:02.4483936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:14:02.4484736Z warnings.warn(
2022-09-27T16:14:02.4499184Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:14:02.4499960Z warnings.warn(
2022-09-27T16:14:02.4516208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:14:02.4533696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:14:02.4534505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:14:02.4618893Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:14:02.9961813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:14:02.9962530Z warnings.warn(msg, FutureWarning)
2022-09-27T16:14:02.9963486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:14:02.9964145Z warnings.warn(msg, FutureWarning)
2022-09-27T16:14:03.0057147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:14:03.0057661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:14:03.0058350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:14:03.0059258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
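The FSDP UserWarning above asks callers to pass `device_id` so the module is moved to the GPU before flattening and sharding. A minimal single-process sketch of the recommended usage, assuming one available CUDA device and a placeholder `nn.Linear` module standing in for the test's real model:

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# single-process setup so the example is self-contained (assumes one GPU)
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29502")
dist.init_process_group("nccl", rank=0, world_size=1)

model = nn.Linear(8, 8)  # placeholder module; note it starts on CPU

# device_id tells FSDP to move the module to that GPU before it flattens
# and shards parameters, avoiding the slower CPU path the warning
# describes; it also satisfies the GPU-communication requirement of
# sync_module_states=True.
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),
    sync_module_states=True,
)

dist.destroy_process_group()
```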
2022-09-27T16:14:03.5392872Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:14:03.5393440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:14:03.5394230Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:14:03.5394923Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
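The `torch.testing.assert_allclose()` FutureWarning emitted during this test's setup names its replacement directly, so the migration is a one-line change:

```python
import torch

expected = torch.tensor([1.0, 2.0, 3.0])
actual = expected.clone()

# deprecated since 1.12 and slated for removal in 1.14:
# torch.testing.assert_allclose(actual, expected)

# the replacement the warning recommends; behavior differs in some edge
# cases, see https://github.com/pytorch/pytorch/issues/61844 for details
torch.testing.assert_close(actual, expected)
```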
2022-09-27T16:14:04.0730064Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:14:04.0730609Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:14:04.0731394Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:14:04.0732074Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:14:04.6066209Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:14:04.6066999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:14:04.6067775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:14:04.6068466Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:14:05.1403180Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:14:05.1403751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:14:05.1404535Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:14:05.1405215Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:14:05.6748905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:14:05.6749672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:14:05.6750476Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:14:05.6751353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:14:06.2096193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:14:06.2096811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:14:06.2097600Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:14:06.2098287Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:14:06.7441767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:14:06.7442914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:14:06.7443694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:14:06.7444382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:14:07.2782914Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:14:07.2784080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:14:07.2784940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:14:07.2785790Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:14:07.8124213Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:14:07.8125378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:14:07.8126214Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:07.8126910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:07.8208141Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it.
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:07.8209342Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:08.3486365Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-09-27T16:14:08.3487524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-09-27T16:14:08.3488386Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-09-27T16:14:08.3489066Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-09-27T16:14:08.3516352Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:08.3517648Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:08.3519047Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:08.3520287Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:08.3521548Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:08.3522876Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:14:09.4798712Z ok (10.822s) 2022-09-27T16:14:09.4818343Z test_mixture_of_experts_with_delay_before_free_offload_true_no_shard (__main__.TestParityWithDDP) ... 
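The python_variable.cpp:326 warning above names PyTorch's private Tensor._fix_weakref() API. A minimal sketch of the pattern the warning asks for, assuming a hypothetical tensor and weakref (the resurrection machinery that actually triggers the warning is internal to PyTorch, not shown here):

    import weakref

    import torch

    t = torch.ones(2)      # hypothetical tensor standing in for the test's tensors
    wr = weakref.ref(t)    # take a weak reference to the Python wrapper

    # ... code that can let the underlying C++ Tensor outlive the wrapper ...

    alive = wr()           # dereference the weak reference
    if alive is not None:
        # per the warning text: repair the PyObject <-> C++ Tensor link after
        # dereferencing, so deallocation does not trip the decref check
        alive._fix_weakref()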
2022-09-27T16:14:09.4818343Z test_mixture_of_experts_with_delay_before_free_offload_true_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37876
2022-09-27T16:14:09.4824690Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37877
2022-09-27T16:14:11.0958305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:14:11.0958796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:14:11.0961527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:14:11.0962282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:14:11.3491586Z dist init r=1, world=2
2022-09-27T16:14:11.3495571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:14:11.3823435Z dist init r=0, world=2
2022-09-27T16:14:11.3828683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:14:11.3829547Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:11.3902015Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:12.7565043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:14:12.7565580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:14:13.1872331Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:14:13.1873390Z warnings.warn(
2022-09-27T16:14:13.1903924Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:14:13.1993381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:14:13.1994474Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
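The FSDP UserWarning above recommends constructing FullyShardedDataParallel with the `device_id` argument. A minimal sketch, assuming this rank has already initialized a process group and has a CUDA device; the nn.Linear module is a placeholder, not the test's mixture-of-experts model:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # assumes torch.distributed.init_process_group(...) has already run on this rank
    cpu_module = nn.Linear(8, 8)  # hypothetical module constructed on CPU

    # passing device_id lets FSDP move the module to the GPU before flattening
    # and sharding, avoiding the slower CPU path the warning describes
    sharded = FSDP(cpu_module, device_id=torch.cuda.current_device())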
2022-09-27T16:14:13.2006503Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:14:13.2121845Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:14:13.2134456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:14:13.2137425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:14:13.2138097Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:14:13.2237357Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:14:13.2364159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:14:13.2365199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:14:13.2365994Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:14:13.2466719Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:14:13.2591926Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:14:13.2594505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:14:13.2595336Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:14:13.2694493Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:14:13.2818334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:14:13.2820751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:14:13.2821434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:14:13.2920782Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:14:13.3045702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:14:13.3048784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:14:13.3049480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:14:13.3148142Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:14:13.3281940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:14:13.3285521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:14:13.3286211Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:14:13.3384499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:14:13.9248181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:14:13.9248898Z warnings.warn(msg, FutureWarning)
2022-09-27T16:14:13.9357902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:14:13.9360660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:14:13.9361704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:14:13.9460924Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:14:14.5329279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:14:14.5346383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:14:14.5347370Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:14:14.5430693Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:14:15.1300200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:14:15.1300749Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:14:15.1301995Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:14:15.1400995Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
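The FutureWarning above names its own fix: torch.testing.assert_close() replaces the deprecated torch.testing.assert_allclose(). A minimal sketch of the migration, with placeholder tensors:

    import torch

    a = torch.tensor([1.0, 2.0])
    b = torch.tensor([1.0, 2.0])

    # deprecated since 1.12, removal planned for 1.14:
    # torch.testing.assert_allclose(a, b)

    # replacement named in the warning (see pytorch/pytorch#61844 for details):
    torch.testing.assert_close(a, b)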
2022-09-27T16:14:15.7476348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:14:15.7477198Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:14:15.7478280Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:15.7501358Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:14:15.7578188Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:16.3476050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:14:16.3477229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:14:16.3477995Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:14:16.3577735Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:14:16.9439990Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1
2022-09-27T16:14:16.9440823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0
2022-09-27T16:14:16.9441621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes.
2022-09-27T16:14:16.9541025Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes.
2022-09-27T16:14:17.5523094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1
2022-09-27T16:14:17.5523898Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0
2022-09-27T16:14:17.5524689Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes.
2022-09-27T16:14:17.5623948Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes.
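Each "Added key: store_based_barrier_key:N" / "Completed store-based barrier" pair marks one process-group creation finishing its store-based rendezvous on this PyTorch version, and the counter increments once per group. A minimal sketch of what produces these lines; the backend, address, and port are placeholder assumptions, not the CI job's values:

    import os

    import torch.distributed as dist

    def init_rank(rank: int, world_size: int) -> None:
        # placeholder rendezvous settings for a single-host run
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        # finishing init emits "Added key: store_based_barrier_key:1 ..." and
        # "Rank r: Completed store-based barrier ..." as seen in the log
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        # each additional group bumps the key: store_based_barrier_key:2, :3, ...
        dist.new_group(ranks=list(range(world_size)))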
2022-09-27T16:14:18.1650170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1
2022-09-27T16:14:18.1652052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0
2022-09-27T16:14:18.1653088Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes.
2022-09-27T16:14:18.1751352Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes.
2022-09-27T16:14:18.7740296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1
2022-09-27T16:14:18.7740909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0
2022-09-27T16:14:18.7741697Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes.
2022-09-27T16:14:18.7841075Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes.
2022-09-27T16:14:19.3697554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1
2022-09-27T16:14:19.3698364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0
2022-09-27T16:14:19.3699361Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes.
2022-09-27T16:14:19.3798644Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes.
2022-09-27T16:14:19.9649654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1
2022-09-27T16:14:19.9653362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0
2022-09-27T16:14:19.9654149Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
2022-09-27T16:14:19.9750871Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
2022-09-27T16:14:21.1025294Z ok (11.623s)
2022-09-27T16:14:21.1044508Z test_mixture_of_experts_with_delay_before_free_offload_true_none (__main__.TestParityWithDDP) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38129
2022-09-27T16:14:21.1051033Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38130
2022-09-27T16:14:22.7445153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:14:22.7445667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:14:22.7448736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:14:22.7449267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:14:22.9983026Z dist init r=1, world=2
2022-09-27T16:14:22.9986930Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:14:23.0480850Z dist init r=0, world=2
2022-09-27T16:14:23.0485748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:14:23.0486559Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:23.0494721Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:24.4265603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:14:24.4266165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:14:24.8930497Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:14:24.8931419Z warnings.warn(
2022-09-27T16:14:24.8961358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:14:24.8987340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:14:24.8988773Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
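For context: the fully_sharded_data_parallel.py:1414 UserWarning above fires because the wrapped module still lives on CPU. A hedged sketch of the remedy the message suggests, passing device_id so FSDP moves the module to the local GPU before flattening and sharding (model here is an assumed nn.Module; a process group is assumed to be initialized):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # device_id tells FSDP which CUDA device to move the module to before
    # flattening/sharding; a GPU-resident module is also what
    # sync_module_states=True needs for its GPU communication.
    fsdp_model = FSDP(
        model,  # assumed: an nn.Module constructed on CPU
        device_id=torch.cuda.current_device(),
        sync_module_states=True,
    )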
2022-09-27T16:14:24.9065258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:14:24.9183052Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:14:24.9194543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1
2022-09-27T16:14:24.9198140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0
2022-09-27T16:14:24.9199926Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:14:24.9297455Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes.
2022-09-27T16:14:24.9416941Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:337: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:14:24.9418593Z shapes.append(param.shape)
2022-09-27T16:14:24.9426001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1
2022-09-27T16:14:24.9431160Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0
2022-09-27T16:14:24.9432552Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:14:24.9529128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes.
2022-09-27T16:14:24.9654523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1
2022-09-27T16:14:24.9656379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0
2022-09-27T16:14:24.9657796Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:14:24.9756898Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes.
2022-09-27T16:14:24.9881083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1
2022-09-27T16:14:24.9882227Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0
2022-09-27T16:14:24.9883609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:14:24.9884961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes.
2022-09-27T16:14:25.0009602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1
2022-09-27T16:14:25.0011003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0
2022-09-27T16:14:25.0011748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:14:25.0012438Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes.
2022-09-27T16:14:25.0146416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0
2022-09-27T16:14:25.0147548Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1
2022-09-27T16:14:25.0148405Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:14:25.0149102Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes.
2022-09-27T16:14:25.6066703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:14:25.6068097Z warnings.warn(msg, FutureWarning)
2022-09-27T16:14:25.6171198Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
2022-09-27T16:14:25.6172602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:14:25.6173638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0
2022-09-27T16:14:25.6174929Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes.
2022-09-27T16:14:26.2214071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0
2022-09-27T16:14:26.2215123Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1
2022-09-27T16:14:26.2216862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
2022-09-27T16:14:26.2218287Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes.
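For context: the _deprecated.py:35 FutureWarning above is the stock deprecation notice for torch.testing.assert_allclose. The one-line migration it asks for, as a minimal sketch:

    import torch

    actual = torch.tensor([1.0, 2.0])
    expected = torch.tensor([1.0, 2.0])
    # Deprecated since 1.12: torch.testing.assert_allclose(actual, expected)
    # Replacement; raises AssertionError on mismatch, with default
    # rtol/atol chosen from the dtype:
    torch.testing.assert_close(actual, expected)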
2022-09-27T16:14:26.8153726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0
2022-09-27T16:14:26.8154744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1
2022-09-27T16:14:26.8156194Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:14:26.8157517Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes.
2022-09-27T16:14:27.4257572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:14:27.4258344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:14:27.4259144Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:27.4259811Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:27.4277660Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:14:28.0202105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
2022-09-27T16:14:28.0202667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1
2022-09-27T16:14:28.0203446Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:14:28.0204136Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes.
2022-09-27T16:14:28.6106962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1
2022-09-27T16:14:28.6108039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0
2022-09-27T16:14:28.6109533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes.
2022-09-27T16:14:28.6111228Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes.
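For context: the python_variable.cpp:326 warning repeated above describes a Tensor whose Python object was brought back through a weak reference without the private repair call the message names. A hedged sketch of the pattern (Tensor._fix_weakref() is a private API whose behavior may differ across PyTorch versions):

    import weakref
    import torch

    t = torch.ones(3)
    ref = weakref.ref(t)   # take a weak reference to the tensor
    again = ref()          # dereference it, resurrecting the PyObject
    again._fix_weakref()   # the private call the warning says was skipped;
                           # without it, deallocating the tensor later can
                           # emit the "(function decref)" warning above
    del again, t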
2022-09-27T16:14:29.2002221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1
2022-09-27T16:14:29.2002762Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0
2022-09-27T16:14:29.2003555Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes.
2022-09-27T16:14:29.2004250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes.
2022-09-27T16:14:29.7947124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1
2022-09-27T16:14:29.7948204Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0
2022-09-27T16:14:29.7949648Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes.
2022-09-27T16:14:29.7951463Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes.
2022-09-27T16:14:30.3843416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0
2022-09-27T16:14:30.3844437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1
2022-09-27T16:14:30.3845855Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes.
2022-09-27T16:14:30.3847301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes.
2022-09-27T16:14:30.9746625Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1
2022-09-27T16:14:30.9747716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0
2022-09-27T16:14:30.9749149Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes.
2022-09-27T16:14:30.9750535Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes.
2022-09-27T16:14:31.5641167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1
2022-09-27T16:14:31.5642188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0
2022-09-27T16:14:31.5643695Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
2022-09-27T16:14:31.5645458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
2022-09-27T16:14:31.5665884Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:14:32.7255127Z ok (11.623s)
2022-09-27T16:14:32.7276476Z test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38382
2022-09-27T16:14:32.7283264Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38383
2022-09-27T16:14:34.3903172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:14:34.3903653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:14:34.3906212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:14:34.3906723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:14:34.6427430Z dist init r=1, world=2
2022-09-27T16:14:34.6431367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:14:34.6483660Z dist init r=0, world=2
2022-09-27T16:14:34.6489034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:14:34.6489805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:34.6535012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:36.0161839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:14:36.0162368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:14:36.4594926Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:14:36.4595746Z warnings.warn(
[the same UserWarning is emitted by the other rank at 2022-09-27T16:14:36.4603717Z-16:14:36.4604473Z]
2022-09-27T16:14:36.4627913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1
2022-09-27T16:14:36.4638591Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0
2022-09-27T16:14:36.4640108Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:14:36.4732224Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes.
2022-09-27T16:14:36.4846873Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[identical warning repeated once at 2022-09-27T16:14:36.4849548Z]
[store-based barrier INFO records for key 3 complete for both ranks at 2022-09-27T16:14:36.4860918Z-16:14:36.4864649Z]
2022-09-27T16:14:36.4981310Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:337: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:14:36.4983004Z shapes.append(param.shape)
[the same UserWarning and source line are emitted again at 2022-09-27T16:14:36.4985382Z-16:14:36.4987136Z]
[barriers for keys 4-8 complete in the same pattern through 2022-09-27T16:14:36.5813074Z, interleaved with further pairs of the decref warning (last at 2022-09-27T16:14:36.5822753Z)]
2022-09-27T16:14:37.1864234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:14:37.1865618Z warnings.warn(msg, FutureWarning)
[the same FutureWarning is emitted by the other rank at 2022-09-27T16:14:37.1867473Z-16:14:37.1868738Z]
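[editor's note] The FutureWarning above names the replacement API, torch.testing.assert_close(). A minimal migration sketch (tensor values are illustrative):

    import torch
    from torch.testing import assert_close

    actual = torch.randn(4)
    expected = actual.clone()

    # Before (deprecated since 1.12, removal slated for 1.14):
    # torch.testing.assert_allclose(actual, expected)

    # After: assert_close also checks dtype and device by default.
    assert_close(actual, expected)
    # Opt out of the stricter dtype check when comparing mixed precision:
    assert_close(actual, expected.double(), check_dtype=False)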
2022-09-27T16:14:37.1967277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1
[store-based barrier INFO records for keys 9-11 complete for both ranks through 2022-09-27T16:14:38.3771174Z]
2022-09-27T16:14:39.0052329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0
2022-09-27T16:14:39.0053104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1
2022-09-27T16:14:39.0054677Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:39.0073750Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[identical warning repeated 47 more times through 2022-09-27T16:14:39.0131427Z]
2022-09-27T16:14:39.0153324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes.
2022-09-27T16:14:39.0171614Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[identical warning repeated 47 more times through 2022-09-27T16:14:39.0228858Z]
2022-09-27T16:14:39.6086489Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0
[store-based barrier INFO records for keys 13-18 complete for both ranks through 2022-09-27T16:14:42.5658148Z]
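[editor's note] The decref warnings flooding this test reference the private torch.Tensor._fix_weakref() hook. A hedged illustration of the contract the warning text describes — not a reproduction of this test's failure, which is triggered from C++ inside FSDP's flat-parameter handling:

    import weakref
    import torch

    t = torch.arange(3.0)
    wr = weakref.ref(t)
    assert wr() is t  # dereferencing the weakref yields the live PyObject

    # Per the warning text, code that dereferences a weak reference to a Tensor
    # is expected to call _fix_weakref() afterwards so the PyObject stays tied
    # to the underlying TensorImpl; skipping it can leave a live PyObject behind
    # when the C++ side deallocates the tensor ("(function decref)" above).
    t._fix_weakref()  # private API named by the warning; harmless on a healthy tensor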
2022-09-27T16:14:43.1562080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0
2022-09-27T16:14:43.1562951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1
2022-09-27T16:14:43.1563727Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
2022-09-27T16:14:43.1564428Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes.
2022-09-27T16:14:43.1585391Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[identical warning repeated 5 more times through 2022-09-27T16:14:43.1592011Z]
2022-09-27T16:14:44.3484771Z ok (11.623s)
2022-09-27T16:14:44.3505833Z test_nested_always_wrap_model_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38635
2022-09-27T16:14:44.3513431Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38636
[both workers again report the "loaded 45 slow tests" / "loaded 261 disabled tests" UserWarnings at 2022-09-27T16:14:45.9613482Z-16:14:45.9761957Z]
2022-09-27T16:14:46.2243345Z dist init r=0, world=2
2022-09-27T16:14:46.2247159Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:14:46.2315714Z dist init r=1, world=2
2022-09-27T16:14:46.2321304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:14:46.2322437Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:46.2349818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:14:47.5963449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:14:47.5963967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:14:48.0315174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:14:48.0324449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:14:48.0355040Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:14:48.0355851Z warnings.warn(
[the same UserWarning is emitted by the other rank at 2022-09-27T16:14:48.0364321Z-16:14:48.0365070Z]
2022-09-27T16:14:48.0819594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:14:48.0820316Z warnings.warn(msg, FutureWarning)
[the same FutureWarning is emitted by the other rank at 2022-09-27T16:14:48.0836250Z-16:14:48.0836928Z]
2022-09-27T16:14:48.0888722Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[the "Reducer buckets have been rebuilt" INFO pair (one line per rank) repeats through 2022-09-27T16:14:48.6361789Z]
2022-09-27T16:14:49.1626183Z ok (4.814s)
2022-09-27T16:14:49.1645641Z test_nested_always_wrap_model_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38720
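[editor's note] The recurring FSDP UserWarning above suggests passing `device_id` so flattening and sharding run on the GPU rather than the CPU. A hedged sketch of the suggested construction — process-group setup is assumed to have already happened, and the model here is a stand-in, not the test's model:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Assumes torch.distributed.init_process_group() has already run on each rank.
    model = nn.Linear(8, 8)  # still on CPU here, which is what triggers the warning

    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # move module to GPU before flattening/sharding
        sync_module_states=True,                # requires GPU communication, per the warning
    )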
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38720 2022-09-27T16:14:49.1651841Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38721 2022-09-27T16:14:50.7992451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:14:50.7992980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:14:50.7995796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:14:50.7996291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:14:50.8207172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:14:50.8208172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:14:50.8211844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:14:50.8212759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:14:51.0588332Z dist init r=1, world=2 2022-09-27T16:14:51.0592118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:14:51.0672401Z dist init r=0, world=2 2022-09-27T16:14:51.0677851Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:14:51.0678648Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:14:51.0695635Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:14:52.4402237Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:14:52.4403064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:14:52.8820485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:52.8821007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:52.8858942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:14:52.8860024Z warnings.warn( 2022-09-27T16:14:52.8861135Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:14:52.8861877Z warnings.warn( 2022-09-27T16:14:52.9535623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. 
Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:14:52.9536286Z warnings.warn(msg, FutureWarning) 2022-09-27T16:14:52.9539401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:14:52.9540069Z warnings.warn(msg, FutureWarning) 2022-09-27T16:14:52.9593241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:52.9593730Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.0336342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.0336834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.1084988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.1085489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.1832993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.1833502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.2575072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.2575575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.3322495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.3322984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.4061120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.4061599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.4820695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.4821187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.5569479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.5570017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.6329360Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.6329896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.7077524Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:53.7078012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:54.2772125Z ok (5.114s) 2022-09-27T16:14:54.2792344Z test_nested_always_wrap_model_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38805 2022-09-27T16:14:54.2799673Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38806 2022-09-27T16:14:55.9146213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:14:55.9146706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:14:55.9149497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:14:55.9149984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:14:55.9416724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:14:55.9417173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:14:55.9421304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:14:55.9421790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:14:56.1708111Z dist init r=0, world=2 2022-09-27T16:14:56.1711988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:14:56.1887698Z dist init r=1, world=2 2022-09-27T16:14:56.1893105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:14:56.1893861Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:14:56.1917085Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:14:57.5625091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:14:57.5625600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:14:58.0212792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.0220486Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.0251393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:14:58.0252162Z warnings.warn( 2022-09-27T16:14:58.0260378Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:14:58.0261400Z warnings.warn( 2022-09-27T16:14:58.0912008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. 
Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:14:58.0912672Z warnings.warn(msg, FutureWarning) 2022-09-27T16:14:58.0916419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:14:58.0917254Z warnings.warn(msg, FutureWarning) 2022-09-27T16:14:58.0969203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.0969704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.1689281Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.1689786Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.2415765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.2416263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.3143548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.3144833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.3867202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.3867702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.4594207Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.4595279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.5313037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.5313817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.6051575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.6052398Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.6771752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.6772254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.7499993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.7500504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.8226534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:58.8227030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:14:59.3920506Z ok (5.115s) 2022-09-27T16:14:59.3939891Z test_nested_always_wrap_model_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38890 2022-09-27T16:14:59.3946107Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38891 2022-09-27T16:15:01.0198236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:01.0198904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:01.0201829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:01.0202368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:01.0260208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:01.0260713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:01.0267746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:01.0268234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:01.2885896Z dist init r=0, world=2 2022-09-27T16:15:01.2890082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:15:01.3001617Z dist init r=1, world=2 2022-09-27T16:15:01.3006908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:15:01.3007993Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:01.3095064Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:02.6799984Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:15:02.6800507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:15:03.1059531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1067833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1098531Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:03.1099291Z warnings.warn( 2022-09-27T16:15:03.1107765Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:03.1108532Z warnings.warn( 2022-09-27T16:15:03.1218099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1218900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
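The FutureWarning repeated throughout this run comes from the test harness still calling torch.testing.assert_allclose(), which the message says is deprecated since 1.12 and slated for removal in 1.14. A minimal sketch of the migration the warning asks for (the tensors here are made-up illustrative values, not taken from these tests):

    import torch
    from torch.testing import assert_close

    actual = torch.tensor([1.0, 2.0, 3.0])
    expected = torch.tensor([1.0, 2.0, 3.0000001])

    # Deprecated spelling the warning flags:
    #   torch.testing.assert_allclose(actual, expected)
    # Recommended replacement; assert_close derives rtol/atol from the dtype
    # and also checks dtype and device by default (relax with check_dtype=False
    # or check_device=False if the old, looser behavior is needed).
    assert_close(actual, expected)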
2022-09-27T16:15:03.1352288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1353064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1485169Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1486034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1618304Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1618961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1750569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1751541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1883560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.1884972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.2532579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:03.2533293Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:03.2534225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:03.2535048Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:03.2582858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.2583892Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.3267683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.3268475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.3966613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.3967473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.4652871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.4653952Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.5317715Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.5318793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.5997221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.5998265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.6662886Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
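The paired "Reducer buckets have been rebuilt in this iteration" INFO lines (one per rank) are emitted by DDP's gradient reducer: after the first backward pass it reorders its allreduce buckets to match the order in which gradients actually become ready, and it logs this rebuild once per Reducer, so the steady repetition here suggests each parity check constructs fresh wrapped models. A minimal sketch of the code path that triggers the message, assuming an already-initialized process group and one CUDA device per rank (the Linear model is a placeholder):

    import torch
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    model = nn.Linear(16, 16).cuda()
    # bucket_cap_mb sets the bucket size; the rebuild itself is automatic.
    ddp = DDP(model, device_ids=[torch.cuda.current_device()], bucket_cap_mb=25)
    loss = ddp(torch.randn(4, 16, device="cuda")).sum()
    loss.backward()  # first backward: buckets are rebuilt and the INFO line fires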
2022-09-27T16:15:03.6663937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.7330279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.7331395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.7998791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.7999840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.8668013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.8669071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.9350137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:03.9351497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:04.5045248Z ok (5.112s) 2022-09-27T16:15:04.5063763Z test_nested_always_wrap_model_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38975 2022-09-27T16:15:04.5070113Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38976 2022-09-27T16:15:06.1171740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:06.1172255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:06.1175105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:06.1175621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:06.1322623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:06.1323072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:06.1326923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:06.1327410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:06.3767967Z dist init r=1, world=2 2022-09-27T16:15:06.3771735Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:15:06.3827340Z dist init r=0, world=2 2022-09-27T16:15:06.3832475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:15:06.3833847Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:06.3874935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:07.7709978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:15:07.7710538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:15:08.2018682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2026636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
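The "dist init r=N, world=2" prints and the store_based_barrier_key INFO lines trace process-group setup on each rank. A sketch of the equivalent per-rank initialization, with placeholder rendezvous values (each spawned process would pass its own rank):

    import os
    import torch.distributed as dist

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder
    os.environ.setdefault("MASTER_PORT", "29500")      # placeholder
    dist.init_process_group(backend="nccl", rank=0, world_size=2)
    # In this torch version init_process_group ends with the store-based
    # barrier logged above: each rank adds store_based_barrier_key:1 to the
    # store and waits until all world_size ranks have checked in.
    dist.destroy_process_group()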
2022-09-27T16:15:08.2056949Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:08.2057737Z warnings.warn( 2022-09-27T16:15:08.2066050Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:08.2066809Z warnings.warn( 2022-09-27T16:15:08.2182760Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2183266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2322835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2323324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2461431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2461920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2599545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2600030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2738233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2738735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2876962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.2877479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.3689324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:08.3689996Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:08.3690922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:08.3691766Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:08.3742308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.3742786Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
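The FSDP UserWarning from fully_sharded_data_parallel.py:1414 fires because the wrapped module is still on CPU, so parameter flattening and sharding run on CPU. A minimal sketch of the fix the warning itself proposes, assuming an initialized process group and an available GPU (the Linear module is a stand-in for the test's model):

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(8, 8)  # still on CPU: this is what triggers the warning

    # device_id moves the module to the given GPU before flattening/sharding,
    # which also makes sync_module_states=True usable, since that flag needs
    # GPU communication.
    fsdp_model = FSDP(model, device_id=torch.cuda.current_device())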
2022-09-27T16:15:08.4586343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.4586833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.5445388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.5445857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.6293268Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.6293772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.7340954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.7341468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.8191790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.8192282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.9018231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.9018725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.9850603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:08.9851087Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.0685707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.0686594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.1524044Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.1524542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.2376608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.2377118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:09.8170373Z ok (5.312s) 2022-09-27T16:15:09.8189778Z test_nested_always_wrap_model_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39060 2022-09-27T16:15:09.8196553Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39061 2022-09-27T16:15:11.4257687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:11.4258403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:11.4259792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:11.4260253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:11.4607183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:11.4607649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:11.4611972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:11.4612431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:11.6812926Z dist init r=1, world=2 2022-09-27T16:15:11.6817024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:15:11.7049759Z dist init r=0, world=2 2022-09-27T16:15:11.7055022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:15:11.7055810Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:11.7122834Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:13.0837921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:15:13.0838472Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:15:13.5248622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5256227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5288154Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:13.5288950Z warnings.warn( 2022-09-27T16:15:13.5297065Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:13.5297839Z warnings.warn( 2022-09-27T16:15:13.5419500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5420021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:15:13.5564508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5565007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5708493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5709013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5852406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5852906Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5996800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.5997614Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.6141173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.6141673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.6969231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:13.6969926Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:13.6971652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:13.6972480Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:13.7026652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.7027167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.7898752Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.7899254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.8781893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.8782411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.9653600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:13.9654114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.0661576Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.0662128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.1539344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.1539828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.2388664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:15:14.2389166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.3241605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.3242116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.4100213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.4100713Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.4962720Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.4963204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.5838952Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:14.5839453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:15.1305425Z ok (5.313s) 2022-09-27T16:15:15.1323892Z test_nested_wrapped_model_offload_false_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39145 2022-09-27T16:15:15.1330091Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39146 2022-09-27T16:15:16.8191940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:16.8192513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:16.8194977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:16.8195470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:16.8579990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:16.8580470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:16.8584644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:16.8585301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:17.0695186Z dist init r=1, world=2 2022-09-27T16:15:17.0698799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:15:17.1037070Z dist init r=0, world=2 2022-09-27T16:15:17.1042189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:15:17.1042995Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:17.1105296Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:18.4914163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:15:18.4914731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:15:18.9294707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:18.9295252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
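Each "Started process {0,1} with pid ..." pair marks the harness forking one worker per rank for the next test case. A sketch of the general shape using torch.multiprocessing (the worker body is illustrative; the internal common_distributed harness may wire this up differently):

    import torch.multiprocessing as mp

    def worker(rank: int, world_size: int) -> None:
        # stand-in for the per-rank test body
        print(f"dist init r={rank}, world={world_size}")

    if __name__ == "__main__":
        world_size = 2
        # spawn() starts one child process per rank and joins them.
        mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)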
2022-09-27T16:15:18.9327723Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:18.9328498Z warnings.warn( 2022-09-27T16:15:18.9329608Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:15:18.9330341Z warnings.warn( 2022-09-27T16:15:18.9679330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:18.9680012Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:18.9684683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:15:18.9685340Z warnings.warn(msg, FutureWarning) 2022-09-27T16:15:18.9738125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:18.9738611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.0150406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.0151163Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.0564446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.0564923Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.0979819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.0980314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.1392877Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.1393508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.1808884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.1809375Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.2219150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.2219646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:15:19.2633171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.2633665Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.3049520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.3050007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.3468273Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.3468748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.3894790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.3895306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:15:19.4141533Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) [identical warning repeated 31 more times between 16:15:19.414 and 16:15:19.418; duplicates elided] 2022-09-27T16:15:19.9421363Z ok (4.811s) 2022-09-27T16:15:19.9440667Z test_nested_wrapped_model_offload_false_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39230 2022-09-27T16:15:19.9447460Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39231 2022-09-27T16:15:21.6427582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:21.6428105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:21.6430803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:21.6431562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:21.6495176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:15:21.6495708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:15:21.6500002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:15:21.6500518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:15:21.8877443Z dist init r=0, world=2 2022-09-27T16:15:21.8880356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:15:21.8985286Z dist init r=1, world=2 2022-09-27T16:15:21.8990690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:15:21.8991817Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:15:21.9085441Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
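The batch of "[W python_variable.cpp:326]" warnings above reports Tensors being deallocated while their Python objects had been resurrected through weak references without the (private) Tensor._fix_weakref() call the message mentions. A sketch of the weak-reference pattern the warning describes; this snippet alone runs cleanly and is not a reproduction of the warning, which depends on internal refcounting details:

    import weakref
    import torch

    t = torch.ones(2)
    r = weakref.ref(t)   # take a weak reference to the Tensor
    again = r()          # dereference it while t is still alive
    del t, again         # the warning concerns the deallocation that follows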
2022-09-27T16:15:23.2486525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:23.2487533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:23.6806615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:23.6812410Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:23.6839350Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:23.6841016Z warnings.warn(
[… same FSDP UserWarning emitted by the second rank, 2022-09-27T16:15:23.6850346Z–16:15:23.6851920Z …]
2022-09-27T16:15:23.7506199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:23.7507522Z warnings.warn(msg, FutureWarning)
[… same FutureWarning emitted by the second rank, 2022-09-27T16:15:23.7509370Z–16:15:23.7511380Z …]
[… "INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration." repeated 10×, 2022-09-27T16:15:23.7560679Z–16:15:23.9792776Z …]
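The assert_allclose FutureWarning above spells out its own migration path. A short sketch of the replacement it names; the explicit rtol/atol in the last call mirror assert_allclose's legacy float32 defaults as documented in the linked issue, and are an assumption here:

    import torch

    a = torch.randn(8)
    b = a.clone()

    # deprecated since 1.12, removal slated for 1.14:
    # torch.testing.assert_allclose(a, b)

    # replacement named by the warning; tolerances default from the dtype:
    torch.testing.assert_close(a, b)

    # to keep assert_allclose's old float32 tolerances exactly:
    torch.testing.assert_close(a, b, rtol=1.3e-6, atol=1e-5)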
[… "Reducer buckets have been rebuilt in this iteration." repeated 10×, 2022-09-27T16:15:24.0349460Z–16:15:24.2594890Z …]
2022-09-27T16:15:24.2789743Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[… identical decref warning repeated 19 more times, 2022-09-27T16:15:24.2792576Z–16:15:24.2837885Z …]
2022-09-27T16:15:24.3185998Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:24.3187021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:24.8541240Z ok (4.912s)
2022-09-27T16:15:24.8560015Z test_nested_wrapped_model_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39315
2022-09-27T16:15:24.8566089Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39316
2022-09-27T16:15:26.4948328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:26.4948824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:26.4950957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:26.4951705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[… the same two UserWarnings emitted again by the second rank, 2022-09-27T16:15:26.5292117Z–16:15:26.5297108Z …]
2022-09-27T16:15:26.7561777Z dist init r=1, world=2
2022-09-27T16:15:26.7565534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:26.7684687Z dist init r=0, world=2
2022-09-27T16:15:26.7689781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:26.7690582Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:26.7770737Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:28.1178334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:28.1178847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:28.5516199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
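The "dist init r=<rank>, world=2" and store_based_barrier_key lines mark each rank's rendezvous: each spawned process calls init_process_group, and the ranks meet in a store-based barrier before the group is usable. A hedged sketch of that call; the endpoint below is a placeholder, the nccl backend is assumed from the GPU runner, and a single rank of world_size=2 would block in the barrier until its peer arrives.

    import torch.distributed as dist

    def dist_init(rank: int, world_size: int = 2) -> None:
        # placeholder rendezvous endpoint; the test harness wires its own store
        dist.init_process_group(
            backend="nccl",
            init_method="tcp://127.0.0.1:29500",
            rank=rank,
            world_size=world_size,
        )
        print(f"dist init r={rank}, world={world_size}")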
2022-09-27T16:15:28.5516762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:28.5547496Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:28.5548673Z warnings.warn(
[… same FSDP UserWarning emitted by the second rank, 2022-09-27T16:15:28.5549783Z–16:15:28.5550525Z …]
2022-09-27T16:15:28.6049398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:28.6050079Z warnings.warn(msg, FutureWarning)
[… same FutureWarning emitted by the second rank, 2022-09-27T16:15:28.6051790Z–16:15:28.6052429Z …]
[… "Reducer buckets have been rebuilt in this iteration." repeated 13×, 2022-09-27T16:15:28.6103485Z–16:15:28.9447137Z …]
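The FSDP UserWarning repeated above also carries its fix: construct FSDP with `device_id` so flattening and sharding run on the GPU. A minimal sketch under the assumption that a process group is already initialized and local_rank comes from the launcher; wrap_on_gpu is a hypothetical helper, not part of the test suite.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(module: torch.nn.Module, local_rank: int) -> FSDP:
        return FSDP(
            module,
            # moves the module to this rank's GPU before flattening/sharding
            device_id=torch.device("cuda", local_rank),
            # the flag the warning says needs the module on a GPU device
            sync_module_states=True,
        )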
[… "Reducer buckets have been rebuilt in this iteration." repeated 7×, 2022-09-27T16:15:28.9449279Z–16:15:29.1414162Z …]
2022-09-27T16:15:29.1415590Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[… identical decref warning repeated 19 more times, 2022-09-27T16:15:29.1417973Z–16:15:29.1441353Z …]
2022-09-27T16:15:29.1803074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:29.1803560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:29.6674702Z ok (4.813s)
2022-09-27T16:15:29.6693248Z test_nested_wrapped_model_offload_true_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39400
2022-09-27T16:15:29.6699445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39401
2022-09-27T16:15:31.3324409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:31.3325143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:31.3326173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:31.3326646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[… the same two UserWarnings emitted again by the second rank, 2022-09-27T16:15:31.3461301Z–16:15:31.3466448Z …]
2022-09-27T16:15:31.5902452Z dist init r=0, world=2
2022-09-27T16:15:31.5906527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:31.5938884Z dist init r=1, world=2
2022-09-27T16:15:31.5944415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:31.5945772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:31.6010124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:32.9617116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:32.9617651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:33.4128216Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4128789Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4160934Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:33.4161760Z warnings.warn(
[… same FSDP UserWarning emitted by the second rank, 2022-09-27T16:15:33.4162888Z–16:15:33.4163623Z …]
2022-09-27T16:15:33.4258005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4258516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4312331Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:337: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:15:33.4313187Z shapes.append(param.shape)
[… same flat_param.py:337 UserWarning emitted by the second rank, 2022-09-27T16:15:33.4314662Z–16:15:33.4315498Z …]
2022-09-27T16:15:33.4375531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4376042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4486972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4487451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4511828Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[… identical decref warning repeated 7 more times, 2022-09-27T16:15:33.4513279Z–16:15:33.4521299Z …]
2022-09-27T16:15:33.4605309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4605815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[… the same decref warning emitted 2×, 2022-09-27T16:15:33.4653127Z–16:15:33.4655000Z …]
2022-09-27T16:15:33.4721490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4721970Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4832461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.4832951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[… the same decref warning emitted 8×, 2022-09-27T16:15:33.4859750Z–16:15:33.4868471Z …]
2022-09-27T16:15:33.5359678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:33.5360358Z warnings.warn(msg, FutureWarning)
[… same FutureWarning emitted by the second rank, 2022-09-27T16:15:33.5361405Z–16:15:33.5362077Z …]
2022-09-27T16:15:33.5412406Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.5414574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.5973172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:33.5978768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[… "Reducer buckets have been rebuilt in this iteration." repeated 8×, 2022-09-27T16:15:33.6533165Z–16:15:33.8204563Z …]
2022-09-27T16:15:33.8377055Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:915: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:15:33.8377950Z return iter(self.unbind(0))
[… same _tensor.py:915 UserWarning emitted by the second rank, 2022-09-27T16:15:33.8379086Z–16:15:33.8379856Z …]
[… "Reducer buckets have been rebuilt in this iteration." repeated 10×, 2022-09-27T16:15:33.8777392Z–16:15:34.0963255Z …]
2022-09-27T16:15:34.5793736Z ok (4.912s)
2022-09-27T16:15:34.5811763Z test_nested_wrapped_model_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39485
2022-09-27T16:15:34.5817895Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39486
2022-09-27T16:15:36.2520289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:36.2520814Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:36.2523900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:36.2524390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[… the same two UserWarnings emitted again by the second rank, 2022-09-27T16:15:36.2577772Z–16:15:36.2582486Z …]
2022-09-27T16:15:36.4948583Z dist init r=1, world=2
2022-09-27T16:15:36.4952687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:36.5062405Z dist init r=0, world=2
2022-09-27T16:15:36.5067547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:36.5068459Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:36.5157257Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:37.8535500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:37.8536042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:38.2844478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.2845030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.2876587Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:38.2877356Z warnings.warn(
[… same FSDP UserWarning emitted by the second rank, 2022-09-27T16:15:38.2878461Z–16:15:38.2879214Z …]
2022-09-27T16:15:38.2976753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.2977259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
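The _tensor.py:915 UserWarning earlier in this test surfaces at `return iter(self.unbind(0))`, i.e. inside Tensor.__iter__: iterating a tensor unbinds it along dim 0, so a deallocation warning raised for one of those row views is reported at that frame. A small illustration of the equivalence:

    import torch

    m = torch.arange(6.0).reshape(2, 3)

    # Tensor.__iter__ yields the same views as unbind(0), which is why the
    # warning's traceback points at `return iter(self.unbind(0))`.
    for via_iter, via_unbind in zip(m, m.unbind(0)):
        assert torch.equal(via_iter, via_unbind)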
2022-09-27T16:15:38.3023804Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[… identical decref warning repeated once more, 2022-09-27T16:15:38.3025642Z …]
2022-09-27T16:15:38.3094937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3095460Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3209958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3210442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[… the same decref warning emitted 10×, 2022-09-27T16:15:38.3235033Z–16:15:38.3246170Z …]
2022-09-27T16:15:38.3328981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3329496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[… the same decref warning emitted 2×, 2022-09-27T16:15:38.3377038Z–16:15:38.3378296Z …]
2022-09-27T16:15:38.3446915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3447391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3562301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.3562790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[… the same decref warning emitted 10×, 2022-09-27T16:15:38.3588799Z–16:15:38.3600254Z …]
2022-09-27T16:15:38.4207385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:38.4208062Z warnings.warn(msg, FutureWarning)
[… same FutureWarning emitted by the second rank, 2022-09-27T16:15:38.4209389Z–16:15:38.4210057Z …]
2022-09-27T16:15:38.4260662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.4261544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.4933675Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:38.4934215Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
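The recurring "Reducer buckets have been rebuilt in this iteration." INFO pairs come from DistributedDataParallel: after the first backward pass, DDP reorders its gradient-allreduce buckets to match the order gradients actually arrived, and logs this once per wrapped model (hence once per rank, and once per DDP instance these parity tests construct). A hedged sketch of where the line fires, assuming an initialized process group; first_iteration is a hypothetical helper.

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    def first_iteration(module: torch.nn.Module, batch: torch.Tensor) -> None:
        model = DDP(module)      # one INFO line per DDP instance per rank
        loss = model(batch).sum()
        loss.backward()          # the reducer rebuilds its buckets after this
                                 # first backward pass and emits the INFO log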
2022-09-27T16:15:38.5611255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[last message repeated 7 more times across both ranks, 16:15:38.56-16:15:38.77]
2022-09-27T16:15:38.7881075Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[last message repeated 25 more times, 16:15:38.788-16:15:38.790]
2022-09-27T16:15:38.8418204Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[last message repeated 9 more times across both ranks, 16:15:38.84-16:15:39.11]
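The decref warning above names the private helper Tensor._fix_weakref(). As a rough illustration of the pattern the warning text describes (a hedged sketch; the tensor and calls below are illustrative, and whether the warning actually fires depends on PyTorch internals):

    import weakref
    import torch

    t = torch.ones(3)     # a Tensor with a live PyObject
    wr = weakref.ref(t)   # take a weak reference to the Tensor, as the warning describes
    _ = wr()              # dereference the weak reference
    t._fix_weakref()      # the private helper named in the warning text; calling it
                          # after dereferencing keeps the PyObject bookkeeping consistent
    del t                 # without the call above, deallocation is the point where
                          # "[W python_variable.cpp:326] ... (function decref)" can appear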
2022-09-27T16:15:39.6919154Z ok (5.112s)
2022-09-27T16:15:39.6938059Z test_nested_wrapped_model_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39570
2022-09-27T16:15:39.6944681Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39571
2022-09-27T16:15:41.3486082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:41.3486847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:41.3489001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:41.3489489Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[the same pair of UserWarnings repeats for the second worker process, 16:15:41.38-16:15:41.39]
2022-09-27T16:15:41.6036940Z dist init r=0, world=2
2022-09-27T16:15:41.6040976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:41.6322472Z dist init r=1, world=2
2022-09-27T16:15:41.6328025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:41.6329442Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:41.6346441Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:42.9838395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:42.9838938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
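The "dist init r=<rank>, world=2" and store-based-barrier lines correspond to each spawned worker initializing a two-rank process group. A minimal sketch, assuming a single-host rendezvous (the address, port, and backend below are illustrative, not taken from this job's configuration):

    import os
    import torch.distributed as dist

    def init_worker(rank: int, world_size: int = 2) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # assumed rendezvous host
        os.environ.setdefault("MASTER_PORT", "29500")      # assumed free port
        print(f"dist init r={rank}, world={world_size}")
        # init_process_group performs the store-based barrier the log reports
        # as "store_based_barrier_key:1 ... with 2 nodes".
        dist.init_process_group("nccl", rank=rank, world_size=world_size)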
2022-09-27T16:15:43.4184999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:43.4192702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:15:43.4216686Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:43.4217477Z warnings.warn(
[the same UserWarning repeats for the second rank at 16:15:43.42]
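A minimal sketch of the remedy this FSDP warning suggests, assuming the process group is already initialized (the toy Linear module is illustrative):

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(rank: int) -> FSDP:
        torch.cuda.set_device(rank)
        module = nn.Linear(8, 8)  # constructed on CPU, the situation the warning flags
        # device_id lets FSDP move the module to the right CUDA device before
        # flattening/sharding; it is also what makes sync_module_states=True
        # workable, since that flag needs GPU communication.
        return FSDP(module, device_id=torch.cuda.current_device(), sync_module_states=True)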
[between 16:15:43.43 and 16:15:43.50 the two ranks emit six more "Reducer buckets have been rebuilt in this iteration." pairs, interleaved with 24 more copies of the python_variable.cpp:326 decref warning]
2022-09-27T16:15:43.5560612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:43.5561280Z warnings.warn(msg, FutureWarning)
[the same FutureWarning repeats for the second rank at 16:15:43.556]
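The migration this FutureWarning asks for is mechanical; a hedged before/after sketch (the tensors are illustrative):

    import torch

    a = torch.tensor([1.0, 2.0])
    b = torch.tensor([1.0, 2.0])

    # Deprecated spelling, the one emitting the FutureWarning in this log:
    torch.testing.assert_allclose(a, b)

    # Recommended replacement per the warning text:
    torch.testing.assert_close(a, b)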
2022-09-27T16:15:43.5612068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[last message repeated 11 more times across both ranks, 16:15:43.56-16:15:43.90]
2022-09-27T16:15:43.9116959Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[last message repeated 25 more times, 16:15:43.911-16:15:43.915]
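The recurring "Reducer buckets have been rebuilt in this iteration." lines above come from DDP's gradient reducer, which groups parameters into fixed-size buckets for allreduce and rebuilds the bucketing once after the first backward pass. A minimal sketch of the knob involved, assuming an initialized process group (the model and size are illustrative):

    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    model = nn.Linear(8, 8).cuda()
    # bucket_cap_mb controls the reducer's bucket size (25 MB is the default);
    # the INFO line above is logged when the reducer rebuilds these buckets.
    ddp_model = DDP(model, bucket_cap_mb=25)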
2022-09-27T16:15:43.9650500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
[last message repeated 9 more times across both ranks, 16:15:43.96-16:15:44.24]
2022-09-27T16:15:44.8042835Z ok (5.112s)
2022-09-27T16:15:44.8062331Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39655
2022-09-27T16:15:44.8068671Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39656
2022-09-27T16:15:46.4904759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:46.4905703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:46.4907314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:46.4908252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[the same pair of UserWarnings repeats for the second worker process, 16:15:46.53]
2022-09-27T16:15:46.7403788Z dist init r=0, world=2
2022-09-27T16:15:46.7407654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:46.7698884Z dist init r=1, world=2
2022-09-27T16:15:46.7704322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:46.7705651Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:46.7713321Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:48.1233088Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:48.1233623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:48.5694516Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:48.5695334Z warnings.warn(
[the same UserWarning repeats for the second rank at 16:15:48.58]
2022-09-27T16:15:48.6043590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:48.6044275Z warnings.warn(msg, FutureWarning)
[the same FutureWarning repeats for the second rank at 16:15:48.60]
2022-09-27T16:15:48.9315018Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[last message repeated 21 more times, 16:15:48.931-16:15:48.934]
2022-09-27T16:15:49.4158084Z ok (4.611s)
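For orientation, the TestParityWithDDP cases above check that FSDP matches DDP on the same nested model. A loose sketch of the idea, not the actual test harness (the model factory, input, and the summon_full_params comparison below are assumptions):

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def check_parity(make_model, inp, steps: int = 1) -> None:
        torch.manual_seed(0)
        ddp = DDP(make_model().cuda())
        torch.manual_seed(0)
        fsdp = FSDP(make_model().cuda())
        for wrapped in (ddp, fsdp):
            opt = torch.optim.SGD(wrapped.parameters(), lr=0.01)
            for _ in range(steps):          # "single_iteration" variants use steps=1
                opt.zero_grad()
                wrapped(inp).sum().backward()
                opt.step()
        # Gather full (unsharded) parameters before comparing against DDP.
        with FSDP.summon_full_params(fsdp):
            for p_ddp, p_fsdp in zip(ddp.parameters(), fsdp.parameters()):
                torch.testing.assert_close(p_ddp, p_fsdp)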
2022-09-27T16:15:49.4177705Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39740
2022-09-27T16:15:49.4184251Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39741
2022-09-27T16:15:51.0528556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:51.0529352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:51.0531371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:51.0531860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[the same pair of UserWarnings repeats for the second worker process, 16:15:51.07]
2022-09-27T16:15:51.3140654Z dist init r=1, world=2
2022-09-27T16:15:51.3144538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:51.3186125Z dist init r=0, world=2
2022-09-27T16:15:51.3191864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:51.3192870Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:51.3247522Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:52.7109488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:52.7110059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:53.1417501Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:53.1419023Z warnings.warn(
[the same UserWarning repeats for the second rank at 16:15:53.15]
2022-09-27T16:15:53.1825759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:53.1827195Z warnings.warn(msg, FutureWarning)
[the same FutureWarning repeats for the second rank at 16:15:53.18]
2022-09-27T16:15:53.6076455Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[last message repeated 19 more times, 16:15:53.607-16:15:53.612]
2022-09-27T16:15:54.1279101Z ok (4.712s)
2022-09-27T16:15:54.1298230Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39825
2022-09-27T16:15:54.1304798Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39826
2022-09-27T16:15:55.8063494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:15:55.8064018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:15:55.8067683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:15:55.8068197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
[the same pair of UserWarnings repeats for the second worker process, 16:15:55.83]
2022-09-27T16:15:56.0713033Z dist init r=0, world=2
2022-09-27T16:15:56.0717072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:15:56.0790376Z dist init r=1, world=2
2022-09-27T16:15:56.0795616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:15:56.0796392Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:56.0819647Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:15:57.4749066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:15:57.4749587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:15:57.9128385Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:15:57.9129441Z warnings.warn(
[the same UserWarning repeats for the second rank at 16:15:57.92]
2022-09-27T16:15:57.9183998Z warnings.warn(
2022-09-27T16:15:57.9529119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:57.9529798Z warnings.warn(msg, FutureWarning)
2022-09-27T16:15:57.9530718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:15:57.9531351Z warnings.warn(msg, FutureWarning)
2022-09-27T16:15:58.3815240Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[... identical decref warning repeated 19 more times, 2022-09-27T16:15:58.3816562Z through 2022-09-27T16:15:58.3839128Z ...]
2022-09-27T16:15:58.9397999Z ok (4.812s)
2022-09-27T16:15:58.9416873Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39910
2022-09-27T16:15:58.9423256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39911
2022-09-27T16:16:00.5285661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:00.5286675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:00.5288291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:00.5289236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:00.6038669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:00.6039635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:00.6041892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:00.6042854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:00.7689497Z dist init r=0, world=2
2022-09-27T16:16:00.7693485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:16:00.8510668Z dist init r=1, world=2
2022-09-27T16:16:00.8515702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:16:00.8516783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:00.8606539Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:02.2471208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:16:02.2472206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:16:02.6951085Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
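The `dist init r=..., world=2` lines and the store-based barrier messages above come from process-group initialization: each rank writes store_based_barrier_key:1 into the rendezvous store and blocks until all ranks have checked in. A hedged sketch of the call that produces those log lines; the address and port are placeholders, not values from this job:

    import os

    import torch.distributed as dist

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous host
    os.environ.setdefault("MASTER_PORT", "29500")      # placeholder port

    def init(rank: int, world_size: int) -> None:
        # init_process_group performs the store-based barrier logged above.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        print(f"dist init r={rank}, world={world_size}")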
2022-09-27T16:16:02.6952574Z warnings.warn(
2022-09-27T16:16:02.6977834Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:02.6979363Z warnings.warn(
2022-09-27T16:16:02.7103581Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[... identical decref warning repeated 13 more times, 2022-09-27T16:16:02.7106462Z through 2022-09-27T16:16:02.7417471Z ...]
2022-09-27T16:16:02.7599393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:951: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:16:02.7600945Z subtensor.view(shape)
2022-09-27T16:16:02.7603379Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:951: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:16:02.7605025Z subtensor.view(shape)
2022-09-27T16:16:02.7933931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:02.7935337Z warnings.warn(msg, FutureWarning)
2022-09-27T16:16:02.7942101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:02.7943370Z warnings.warn(msg, FutureWarning)
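The FutureWarning above flags torch.testing.assert_allclose() for removal; issue 61844 linked in the message covers the migration. A minimal before/after sketch of the replacement the warning names:

    import torch
    from torch.testing import assert_close

    actual = torch.tensor([1.0, 2.0, 3.0])
    expected = torch.tensor([1.0, 2.0, 3.0])

    # Deprecated since 1.12 per the warning:
    #   torch.testing.assert_allclose(actual, expected)
    # Replacement; rtol/atol are optional keyword-only overrides:
    assert_close(actual, expected, rtol=1.3e-6, atol=1e-5)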
2022-09-27T16:16:03.0276118Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[... identical decref warning repeated 25 more times, 2022-09-27T16:16:03.0278737Z through 2022-09-27T16:16:03.0313400Z ...]
2022-09-27T16:16:03.7536104Z ok (4.814s)
2022-09-27T16:16:03.7556727Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39995
2022-09-27T16:16:03.7562901Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39996
2022-09-27T16:16:05.3838263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:05.3838775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:05.3841873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:05.3842362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:05.4008786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:05.4009501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:05.4013317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:05.4013799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:05.6487993Z dist init r=1, world=2
2022-09-27T16:16:05.6491613Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:16:05.6550947Z dist init r=0, world=2
2022-09-27T16:16:05.6556115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:16:05.6557011Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:05.6594199Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:07.0255960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:16:07.0256508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:16:07.4533477Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:07.4534298Z warnings.warn(
2022-09-27T16:16:07.4558854Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:07.4559632Z warnings.warn(
2022-09-27T16:16:07.4687677Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[... identical decref warning repeated 21 more times, 2022-09-27T16:16:07.4688925Z through 2022-09-27T16:16:07.5171485Z ...]
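The `Started process N with pid ...` lines above come from the distributed test harness launching one Python process per rank. The harness uses internal helpers in torch.testing._internal.common_distributed, but the general pattern can be sketched with the public torch.multiprocessing API; the worker function here is illustrative:

    import torch.multiprocessing as mp

    def worker(rank: int, world_size: int) -> None:
        # The per-rank test body (dist init, model setup, assertions) runs here.
        print(f"worker started for rank {rank} of {world_size}")

    if __name__ == "__main__":
        world_size = 2
        # spawn() launches world_size processes and joins them, analogous to
        # the two "Started process ... with pid ..." lines in this log.
        mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)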
2022-09-27T16:16:07.5563992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:07.5564668Z warnings.warn(msg, FutureWarning)
2022-09-27T16:16:07.5565566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:07.5566235Z warnings.warn(msg, FutureWarning)
2022-09-27T16:16:07.8233819Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[... identical decref warning repeated 27 more times, 2022-09-27T16:16:07.8235447Z through 2022-09-27T16:16:07.8267702Z ...]
2022-09-27T16:16:08.5670619Z ok (4.813s)
2022-09-27T16:16:08.5691040Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40080
2022-09-27T16:16:08.5697183Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40081
2022-09-27T16:16:10.2068496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:10.2069205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:10.2071893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:10.2072375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:10.2257431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:10.2257898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:10.2262467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:10.2262935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:10.4731348Z dist init r=1, world=2
2022-09-27T16:16:10.4735743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:16:10.4810362Z dist init r=0, world=2
2022-09-27T16:16:10.4815917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:16:10.4817058Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:10.4838860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:11.8505123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:16:11.8505653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:16:12.2965261Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:12.2966084Z warnings.warn(
2022-09-27T16:16:12.3005190Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:12.3005921Z warnings.warn(
2022-09-27T16:16:12.3138006Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[... identical decref warning repeated 21 more times, 2022-09-27T16:16:12.3141841Z through 2022-09-27T16:16:12.3650836Z ...]
2022-09-27T16:16:12.4051606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:12.4052290Z warnings.warn(msg, FutureWarning)
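The FutureWarning above names its replacement directly; a minimal migration sketch (tensor values are illustrative):

    import torch
    from torch.testing import assert_close

    actual = torch.tensor([1.0, 2.0, 3.0])
    expected = torch.tensor([1.0, 2.0, 3.0])

    # Deprecated since 1.12, removal planned for 1.14 (what this log flags):
    # torch.testing.assert_allclose(actual, expected)

    # Replacement named in the warning; see pytorch/pytorch#61844 for the
    # full upgrade notes.
    assert_close(actual, expected)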
2022-09-27T16:16:13.4805879Z ok (4.913s)
2022-09-27T16:16:13.4825597Z test_transformer_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40165
2022-09-27T16:16:13.4832132Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40166
2022-09-27T16:16:15.1687298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:15.1687805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:15.1690898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:15.1691393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:15.4432074Z dist init r=1, world=2
2022-09-27T16:16:15.4436594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:16:15.4459627Z dist init r=0, world=2
2022-09-27T16:16:15.4464880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:16:15.4465685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:15.4539319Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:16.8131260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:16:16.8132051Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:16:17.4531350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:17.4856809Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:17.4857729Z warnings.warn(
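The FSDP UserWarning above also states its own fix: pass `device_id` so flattening and sharding run on the GPU, which `sync_module_states=True` needs anyway. A hedged sketch, assuming a process group is already initialized (as the dist init lines above show for the test workers), a CUDA device is available, and with a Linear module standing in for the test's transformer:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = torch.nn.Linear(8, 8)  # starts on CPU, like the module the warning describes

    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # lets FSDP move the module to GPU first
        sync_module_states=True,                # the state broadcast needs GPU communication
    )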
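The `dist init r=..., world=2` and store-based barrier lines earlier in each test are the two worker processes joining a process group. A hedged sketch of the equivalent per-process call (rendezvous address, port, and backend are assumptions, not read from the log):

    import os

    import torch.distributed as dist

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    rank = int(os.environ.get("RANK", "0"))  # 0 or 1, matching "dist init r=0/r=1"
    dist.init_process_group(backend="nccl", rank=rank, world_size=2)
    # In the PyTorch build used by this job, init_process_group is what logs
    # the "Added key: store_based_barrier_key:1" and "Completed store-based
    # barrier" messages.
    dist.destroy_process_group()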
2022-09-27T16:16:19.6948492Z ok (6.214s)
2022-09-27T16:16:19.6968076Z test_transformer_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40250
2022-09-27T16:16:19.6974654Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40251
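The recurring `Reducer buckets have been rebuilt in this iteration.` INFO lines come from the DDP half of this parity suite: the reducer logs the message once per rank when the first backward pass reorders its gradient buckets. A hedged sketch of a wrap that would produce it, again assuming an initialized process group and a CUDA device:

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    model = torch.nn.Linear(8, 8).cuda()
    ddp_model = DDP(model, device_ids=[torch.cuda.current_device()])

    loss = ddp_model(torch.randn(2, 8, device="cuda")).sum()
    loss.backward()  # the first backward triggers the bucket rebuild and the INFO line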
2022-09-27T16:16:26.2097469Z ok (6.515s)
2022-09-27T16:16:26.2116205Z test_transformer_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40335
2022-09-27T16:16:26.2122652Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40336
(function decref) 2022-09-27T16:16:32.0363650Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:16:32.7243071Z ok (6.514s) 2022-09-27T16:16:32.7264695Z test_transformer_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40420 2022-09-27T16:16:32.7272005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40421 2022-09-27T16:16:34.3912649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:16:34.3913193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:16:34.3927245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:16:34.3927733Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:16:34.4155582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:16:34.4156084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:16:34.4159995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:16:34.4160481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:16:34.6487474Z dist init r=1, world=2 2022-09-27T16:16:34.6491740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:16:34.6523886Z dist init r=0, world=2 2022-09-27T16:16:34.6529603Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:16:34.6530589Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:16:34.6595415Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:16:36.0152848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:16:36.0153374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:16:36.6449883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:16:36.6469135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:16:36.6690608Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:16:36.6692159Z warnings.warn(
2022-09-27T16:16:36.6841080Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:36.7091599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
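The FSDP UserWarning above recommends passing `device_id` so the module is moved to the GPU before flattening and sharding. A minimal sketch of that construction, assuming a process group is already initialized and using a placeholder `nn.Linear` in place of the transformer the test actually wraps:

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Placeholder module standing in for the model under test.
model = nn.Linear(16, 16)

# device_id lets FSDP move the module to the GPU before sharding;
# sync_module_states=True then broadcasts rank 0's module states,
# which requires the module to be on a GPU device.
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),
    sync_module_states=True,
)
```

With `device_id` set, the CPU-placement warning seen here should not fire, since flattening and sharding run on the GPU.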
2022-09-27T16:16:37.1389818Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:915: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:16:37.1390660Z return iter(self.unbind(0))
2022-09-27T16:16:37.1714621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:37.1715312Z warnings.warn(msg, FutureWarning)
2022-09-27T16:16:37.2018405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:37.5706790Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:38.0724145Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:778: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:16:38.0725018Z return torch._VF.split_with_sizes(self, split_size, dim)
2022-09-27T16:16:39.5399424Z ok (6.815s)
2022-09-27T16:16:39.5420175Z test_transformer_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40505
2022-09-27T16:16:39.5427104Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40506
2022-09-27T16:16:41.2038214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:41.2038738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:41.2041823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:41.2042311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:41.4646327Z dist init r=1, world=2
2022-09-27T16:16:41.4650222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:16:41.4810419Z dist init r=0, world=2
2022-09-27T16:16:41.4815803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:16:41.4816858Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:41.4854705Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
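The FutureWarning from torch/testing/_deprecated.py above names its own replacement. A minimal before/after sketch of the migration it asks for:

```python
import torch

actual = torch.tensor([1.0, 2.0, 3.0])
expected = torch.tensor([1.0, 2.0, 3.0])

# Deprecated since 1.12, per the warning:
# torch.testing.assert_allclose(actual, expected)

# Replacement named by the warning; raises on mismatch, returns None.
torch.testing.assert_close(actual, expected)
```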
2022-09-27T16:16:42.8603638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:16:42.8604157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:16:43.5043609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:43.5286073Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:43.5286860Z warnings.warn(
2022-09-27T16:16:43.5432774Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:43.5681858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
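The repeated python_variable.cpp:326 warnings all describe the same weak-reference pattern, and the `return iter(self.unbind(0))` source line shows one internal trigger: iterating a tensor goes through `Tensor.unbind(0)`. A small illustrative sketch of the pattern the warning text describes; note that `_fix_weakref()` is a private `torch.Tensor` method named by the warning itself, and this snippet is not a reproducer for the warning:

```python
import weakref
import torch

t = torch.ones(4)
w = weakref.ref(t)   # weak reference to the tensor's PyObject
alias = w()          # dereferencing yields a new strong reference

# Per the warning text, calling _fix_weakref() after dereferencing keeps
# the PyObject consistent with the underlying C++ tensor, so later
# accesses through it do not fail at deallocation time.
alias._fix_weakref()

# Tensor iteration is implemented as iter(self.unbind(0)), which is the
# internal call site the log's UserWarning points at.
for row in torch.eye(2):
    pass
```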
2022-09-27T16:16:44.4808966Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:44.6607619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:46.7558667Z ok (7.216s)
2022-09-27T16:16:46.7578223Z test_transformer_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40590
2022-09-27T16:16:46.7584355Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40591
2022-09-27T16:16:48.4201812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:16:48.4202524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:16:48.4204758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:16:48.4205239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:16:48.6801331Z dist init r=1, world=2
2022-09-27T16:16:48.6805654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:16:48.6937181Z dist init r=0, world=2
2022-09-27T16:16:48.6942634Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:16:48.6943938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:48.7010013Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:16:50.0618795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:16:50.0619299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:16:50.7026913Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
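The `dist init r=..., world=2` and `store_based_barrier_key` lines show each rank joining a two-process group; `init_process_group` finishes with a store-based barrier, which is what the per-rank "Added key" / "Completed store-based barrier" records trace. A minimal sketch of the per-rank setup, assuming `MASTER_ADDR`, `MASTER_PORT`, and `RANK` are exported (the test harness in this log manages this internally):

```python
import os
import torch.distributed as dist

# One process per rank; rank comes from the launcher's environment.
rank = int(os.environ["RANK"])

# Joins the group and completes the store-based barrier seen in the log.
dist.init_process_group(backend="nccl", rank=rank, world_size=2)

# ... run the distributed workload ...

dist.destroy_process_group()
```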
2022-09-27T16:16:50.7027468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.7268374Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:50.7269158Z warnings.warn(
2022-09-27T16:16:50.7293516Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:16:50.7294323Z warnings.warn(
2022-09-27T16:16:50.7444273Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:50.7445943Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:50.7727904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.7728655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.8098895Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:50.8100151Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:50.8381350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.8381840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.8753775Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:50.8755028Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:50.9043935Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.9044470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.9233630Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[the same python_variable.cpp:326 warning was emitted 25 more times, 2022-09-27T16:16:50.9235454Z through 2022-09-27T16:16:50.9525552Z]
2022-09-27T16:16:50.9811306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:50.9811811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.0183217Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:51.0184620Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:51.0466864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.0467363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.0844385Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:51.0845773Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:51.1108451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.1108953Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.1773647Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:915: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:16:51.1774666Z return iter(self.unbind(0))
2022-09-27T16:16:51.1776488Z /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:915: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:326.)
2022-09-27T16:16:51.1777249Z return iter(self.unbind(0))
2022-09-27T16:16:51.2854088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:51.2854769Z warnings.warn(msg, FutureWarning)
2022-09-27T16:16:51.2864139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
2022-09-27T16:16:51.2864810Z warnings.warn(msg, FutureWarning)
2022-09-27T16:16:51.3189092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.3189593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.5217380Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.5217868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.7254219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.7254727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.7423538Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
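The FutureWarning above spells out the migration: torch.testing.assert_allclose() is deprecated in favor of torch.testing.assert_close(). A minimal before/after sketch (values are illustrative):

    import torch

    actual = torch.tensor([1.0, 2.0])
    expected = torch.tensor([1.0, 2.0])

    # Deprecated spelling that emits the FutureWarning seen above:
    # torch.testing.assert_allclose(actual, expected)

    # Replacement. Note that assert_close() also checks dtype and device by
    # default; pass check_dtype=False or check_device=False to keep the older,
    # looser assert_allclose() behaviour (see pytorch/pytorch#61844).
    torch.testing.assert_close(actual, expected)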
2022-09-27T16:16:51.7424818Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:51.9298848Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:51.9299348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.1351084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.1351889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.3823121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.3823638Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.5839960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.5840470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.6043786Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:52.6045146Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:52.6650804Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
[the same python_variable.cpp:326 warning was emitted 9 more times, 2022-09-27T16:16:52.6652105Z through 2022-09-27T16:16:52.6663285Z]
2022-09-27T16:16:52.8275810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:52.8276325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:53.0288533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:53.0289057Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:53.2059108Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:53.2060401Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:16:53.2325304Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:53.2325834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:53.4346405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:16:53.4346940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
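The fully_sharded_data_parallel.py:1414 UserWarning earlier in this test's output recommends passing device_id so FSDP flattens and shards on GPU. A minimal sketch, assuming an already-initialized process group and one GPU per process (the Linear module is a stand-in, not the test's model):

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Stand-in module constructed on CPU; assumes dist.init_process_group()
    # has already run in this process.
    model = nn.Linear(8, 8)

    # device_id tells FSDP to move the module to this GPU before flattening
    # and sharding, avoiding the CPU path the warning complains about; it is
    # also what makes sync_module_states=True workable (it needs GPU comms).
    fsdp_model = FSDP(model, device_id=torch.cuda.current_device())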
2022-09-27T16:16:53.4760056Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:16:53.4761335Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:16:54.1718274Z ok (7.416s) 2022-09-27T16:16:54.1718490Z 2022-09-27T16:16:54.1718909Z ---------------------------------------------------------------------- 2022-09-27T16:16:54.1719234Z Ran 59 tests in 374.096s 2022-09-27T16:16:54.1719402Z 2022-09-27T16:16:54.1719517Z OK (skipped=5) 2022-09-27T16:16:54.1719671Z 2022-09-27T16:16:54.1720048Z Generating XML reports... 2022-09-27T16:16:54.1811416Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20220927161040.xml 2022-09-27T16:16:54.1815962Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20220927161040.xml 2022-09-27T16:16:54.1822281Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20220927161040.xml 2022-09-27T16:16:54.1875389Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20220927161040.xml 2022-09-27T16:16:54.5490789Z Running distributed/fsdp/test_fsdp_state_dict ... [2022-09-27 16:16:54.548572] 2022-09-27T16:16:54.5491558Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_state_dict.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:16:54.548653] 2022-09-27T16:16:56.4376879Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict 2022-09-27T16:16:56.4406761Z 2022-09-27T16:16:56.4407174Z Running tests... 2022-09-27T16:16:56.4407680Z ---------------------------------------------------------------------- 2022-09-27T16:16:56.4425834Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:16:57.9419498Z Tests that we can save a state_dict and load it into a blank model ... 
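The TestFSDPStateDict cases that follow ("Tests that we can save a state_dict and load it into a blank model") exercise FSDP's state-dict modes. A minimal sketch of the save/load round trip these tests perform, using the public API; the module is a toy stand-in and an initialized process group is assumed:

    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

    # Toy FSDP model; assumes dist.init_process_group() has already run.
    model = FSDP(nn.Linear(8, 8).cuda())

    # state_dict_type selects how parameters materialize: FULL_STATE_DICT,
    # LOCAL_STATE_DICT (the flat local shards), or SHARDED_STATE_DICT.
    with FSDP.state_dict_type(model, StateDictType.LOCAL_STATE_DICT):
        sd = model.state_dict()

    # Load into a freshly constructed ("blank") model under the same mode.
    new_model = FSDP(nn.Linear(8, 8).cuda())
    with FSDP.state_dict_type(new_model, StateDictType.LOCAL_STATE_DICT):
        new_model.load_state_dict(sd)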
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:16:57.9605856Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40710 2022-09-27T16:16:57.9612957Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40711 2022-09-27T16:16:59.5797421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:16:59.5798197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:16:59.5799508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:16:59.5799977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:16:59.5997170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:16:59.5997634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:16:59.6000911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:16:59.6001516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:16:59.8178186Z dist init r=1, world=2 2022-09-27T16:16:59.8189174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:16:59.8286229Z dist init r=0, world=2 2022-09-27T16:16:59.8297879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:16:59.8298737Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:16:59.8393330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:01.2083202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:01.2083768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:01.7695100Z ok (5.329s) 2022-09-27T16:17:01.7713294Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:01.7726929Z Tests that we can save a state_dict and load it into a blank model ... 
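The "dist init r=N, world=2" lines and the store_based_barrier_key messages repeated in each block above are emitted while every spawned test process joins the process group. A minimal sketch of that rendezvous (address and port are illustrative; the test harness supplies its own):

    import os
    import torch.distributed as dist

    # Illustrative rendezvous settings; real runs get these from the harness.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    rank, world_size = 0, 2  # each spawned process passes its own rank

    # init_process_group performs the store-based barrier that produces the
    # "Added key: store_based_barrier_key:1 ..." and "Completed store-based
    # barrier" INFO lines above.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)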
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40791 2022-09-27T16:17:01.7732980Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40792 2022-09-27T16:17:03.4059947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:03.4060786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:03.4061390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:03.4061866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:03.4162153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:03.4162856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:03.4165972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:03.4166448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:03.6464080Z dist init r=0, world=2 2022-09-27T16:17:03.6466112Z dist init r=1, world=2 2022-09-27T16:17:03.6476320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:03.6478387Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:03.6479155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:03.6479822Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:05.0568882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:05.0569430Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:05.4809510Z ok (3.711s) 2022-09-27T16:17:05.4827818Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:05.4841907Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40872 2022-09-27T16:17:05.4848385Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40873 2022-09-27T16:17:07.1354210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:07.1355009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:07.1355660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:07.1356128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:07.1478534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:07.1478996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:07.1482174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:07.1482636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:07.3747369Z dist init r=1, world=2 2022-09-27T16:17:07.3755703Z dist init r=0, world=2 2022-09-27T16:17:07.3758692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:07.3766502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:07.3767517Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:07.3862037Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:08.7589852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:08.7590416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:09.2924638Z ok (3.811s) 2022-09-27T16:17:09.2942938Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:09.2956390Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40953 2022-09-27T16:17:09.2962712Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40954 2022-09-27T16:17:10.9573082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:10.9573840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:10.9574855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:10.9575333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:10.9810170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:10.9810611Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:10.9813660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:10.9814152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:11.2029411Z dist init r=1, world=2 2022-09-27T16:17:11.2041527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:11.2112305Z dist init r=0, world=2 2022-09-27T16:17:11.2125816Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:11.2126594Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:11.2144097Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:12.6091448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:12.6091990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:13.0038817Z ok (3.711s) 2022-09-27T16:17:13.0056628Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:13.0070050Z Tests that we can save a state_dict and load it into a blank model ... 
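The CPUOffload(offload_params=True) fragment in the next test names is FSDP's CPU-offload configuration. A minimal sketch (toy module, initialized process group assumed):

    import torch.nn as nn
    from torch.distributed.fsdp import CPUOffload, FullyShardedDataParallel as FSDP

    # offload_params=True keeps sharded parameters (and their gradients) on
    # CPU between uses, trading GPU memory for host/device copies.
    fsdp_model = FSDP(
        nn.Linear(8, 8),
        cpu_offload=CPUOffload(offload_params=True),
    )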
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41034 2022-09-27T16:17:13.0076999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41035 2022-09-27T16:17:14.6943532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:14.6944038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:14.6945157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:14.6945608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:14.7088939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:14.7089583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:14.7092610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:14.7093090Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:14.9342419Z dist init r=1, world=2 2022-09-27T16:17:14.9352555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:14.9374197Z dist init r=0, world=2 2022-09-27T16:17:14.9386558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:14.9387339Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:14.9455303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:16.3427088Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:16.3427601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:16.9157228Z ok (3.912s) 2022-09-27T16:17:16.9174733Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:16.9187949Z Tests that we can save a state_dict and load it into a blank model ... 
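The state_dict_rank0_and_offload_True variants plausibly map to FSDP's full-state-dict configuration that gathers only on rank 0 and offloads to CPU; the exact wiring inside the test is not visible in this log, so treat the sketch below as an assumption about what is exercised:

    import torch.nn as nn
    from torch.distributed.fsdp import (
        FullStateDictConfig,
        FullyShardedDataParallel as FSDP,
        StateDictType,
    )

    # Toy FSDP model; assumes dist.init_process_group() has already run.
    model = FSDP(nn.Linear(8, 8).cuda())

    # Gather the full state_dict on rank 0 only, moving tensors to CPU as
    # they are assembled; non-zero ranks receive an empty dict.
    cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
    with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, cfg):
        sd = model.state_dict()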
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41115 2022-09-27T16:17:16.9194330Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41116 2022-09-27T16:17:18.5696271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:18.5696780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:18.5697828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:18.5698336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:18.5748220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:18.5748672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:18.5751989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:18.5752466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:18.8182428Z dist init r=0, world=2 2022-09-27T16:17:18.8194563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:18.8439111Z dist init r=1, world=2 2022-09-27T16:17:18.8452046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:18.8452892Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:18.8499939Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:20.2607018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:20.2607549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:20.7274295Z ok (3.812s) 2022-09-27T16:17:20.7292441Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:20.7305553Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41196 2022-09-27T16:17:20.7312111Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41197 2022-09-27T16:17:22.3514862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:22.3515602Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:22.3516404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:22.3516881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:22.3598953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:22.3599416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:22.3602478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:22.3602970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:22.5813298Z dist init r=1, world=2 2022-09-27T16:17:22.5824041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:22.5917246Z dist init r=0, world=2 2022-09-27T16:17:22.5928890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:22.5929649Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:22.6028226Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:23.9647540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:23.9648097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:24.5391072Z ok (3.811s) 2022-09-27T16:17:24.5409218Z test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:24.5423127Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41277 2022-09-27T16:17:24.5429388Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41278 2022-09-27T16:17:26.1597766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:26.1598287Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:26.1598874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:26.1599376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:26.1752479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:26.1752984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:26.1756643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:26.1757135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:26.4018705Z dist init r=0, world=2 2022-09-27T16:17:26.4022973Z dist init r=1, world=2 2022-09-27T16:17:26.4028660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:26.4034598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:26.4035760Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:26.4131924Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:27.7829469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:27.7829998Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:28.2504504Z ok (3.711s) 2022-09-27T16:17:28.2523234Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:28.2536710Z Tests that we can save a state_dict and load it into a blank model ... 
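From here the parameter grid switches from local_state_dict to sharded_state_dict. The same context-manager pattern applies with StateDictType.SHARDED_STATE_DICT; a short sketch (toy module, initialized process group assumed):

    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

    model = FSDP(nn.Linear(8, 8).cuda())

    # Under SHARDED_STATE_DICT each value is a ShardedTensor rather than a
    # flat local shard, which keeps the dict reshardable across world sizes.
    with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
        sd = model.state_dict()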
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41358 2022-09-27T16:17:28.2543347Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41359 2022-09-27T16:17:29.9344772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:29.9345309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:29.9346367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:29.9346892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:29.9466591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:29.9467060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:29.9469713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:29.9470198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:30.1721581Z dist init r=1, world=2 2022-09-27T16:17:30.1731767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:30.1744332Z dist init r=0, world=2 2022-09-27T16:17:30.1756996Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:30.1757764Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:30.1835466Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:31.5821801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:31.5822325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:31.6657878Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:17:31.6660500Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:17:32.1622375Z ok (3.912s) 2022-09-27T16:17:32.1640838Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:32.1654973Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41439 2022-09-27T16:17:32.1661558Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41440 2022-09-27T16:17:33.7879558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:33.7880096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:33.7880732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:33.7881208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:33.8388555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:33.8389049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:33.8391218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:33.8391693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:34.0205639Z dist init r=1, world=2 2022-09-27T16:17:34.0215863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:34.0612828Z dist init r=0, world=2 2022-09-27T16:17:34.0625011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:34.0625948Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:34.0723403Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:35.4621065Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:35.4621625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:35.8735101Z ok (3.711s) 2022-09-27T16:17:35.8752637Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:35.8765767Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41520 2022-09-27T16:17:35.8772483Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41521 2022-09-27T16:17:37.5557226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:37.5557761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:37.5558357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:37.5558832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:37.5678622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:37.5679109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:37.5682297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:37.5682778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:37.7954879Z dist init r=0, world=2 2022-09-27T16:17:37.7966933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:37.7980868Z dist init r=1, world=2 2022-09-27T16:17:37.7993428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:37.7994532Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:37.8069951Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:39.2096557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:39.2097375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:39.7848803Z ok (3.911s) 2022-09-27T16:17:39.7867039Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:39.7882255Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41601 2022-09-27T16:17:39.7888933Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41602 2022-09-27T16:17:41.4393092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:41.4393859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:41.4394697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:41.4395175Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:41.4698256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:41.4698731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:41.4702158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:41.4702634Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:41.6747537Z dist init r=0, world=2 2022-09-27T16:17:41.6758616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:41.6955669Z dist init r=1, world=2 2022-09-27T16:17:41.6967597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:41.6968351Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:41.7066169Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:43.0721559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:43.0722099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:43.4965376Z ok (3.712s) 2022-09-27T16:17:43.4982856Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:43.4996365Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41682 2022-09-27T16:17:43.5002926Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41683 2022-09-27T16:17:45.1464207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:45.1464838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:45.1466056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:45.1466538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:45.1789906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:45.1790351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:45.1793322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:45.1793812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:45.3829103Z dist init r=1, world=2 2022-09-27T16:17:45.3840585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:45.4042385Z dist init r=0, world=2 2022-09-27T16:17:45.4054202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:45.4054976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:45.4146186Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:46.8184954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:46.8185491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:46.9096107Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:17:46.9098308Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:17:47.4082922Z ok (3.912s) 2022-09-27T16:17:47.4100614Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:47.4114802Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41763 2022-09-27T16:17:47.4121024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41764 2022-09-27T16:17:49.0263684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:49.0264244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:49.0264841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:49.0265305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:49.0610184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:49.0610645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:49.0613738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:49.0614194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:49.2629792Z dist init r=1, world=2 2022-09-27T16:17:49.2640819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:49.2857781Z dist init r=0, world=2 2022-09-27T16:17:49.2869507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:49.2870285Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:49.2945687Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:50.6773056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:50.6773579Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:51.1204400Z ok (3.712s) 2022-09-27T16:17:51.1222564Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:51.1235901Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41844 2022-09-27T16:17:51.1242231Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41845 2022-09-27T16:17:52.7987815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:52.7988673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:52.7989827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:52.7990572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:52.8056262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:52.8056729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:52.8059696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:52.8060176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:53.0322920Z dist init r=1, world=2 2022-09-27T16:17:53.0333940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:53.0342444Z dist init r=0, world=2 2022-09-27T16:17:53.0354000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:53.0354851Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:53.0436928Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:54.4289765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:54.4290299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:54.9316923Z ok (3.811s) 2022-09-27T16:17:54.9334384Z test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:17:54.9347628Z Tests that we can save a state_dict and load it into a blank model ... 
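Editor's note: the docstring "Tests that we can save a state_dict and load it into a blank model" describes a round trip like the following. This is a hedged sketch, not the test's code; it assumes an already-initialized process group and a CUDA device, and save_and_load_round_trip is an illustrative name:

```python
# Sketch of the save/load round trip the test names above encode,
# including the CPUOffload(offload_params=...) parameterization.
import torch.nn as nn
from torch.distributed.fsdp import CPUOffload, FullyShardedDataParallel as FSDP

def save_and_load_round_trip(offload_params: bool = True):
    model = FSDP(
        nn.Linear(8, 8).cuda(),
        cpu_offload=CPUOffload(offload_params=offload_params),
    )
    state = model.state_dict()   # gather and save parameters
    blank = FSDP(                # freshly constructed "blank" model
        nn.Linear(8, 8).cuda(),
        cpu_offload=CPUOffload(offload_params=offload_params),
    )
    blank.load_state_dict(state)  # load the saved state back in
    return blank
```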
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41925 2022-09-27T16:17:54.9354033Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41926 2022-09-27T16:17:56.5777325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:56.5777868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:56.5783367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:56.5784035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:56.6213882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:17:56.6214618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:17:56.6217142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:17:56.6217867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:17:56.8133400Z dist init r=0, world=2 2022-09-27T16:17:56.8143255Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:17:56.8454955Z dist init r=1, world=2 2022-09-27T16:17:56.8467177Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:17:56.8468232Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:56.8549588Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:17:58.2522120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:17:58.2522642Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:17:58.7428819Z ok (3.811s) 2022-09-27T16:17:58.7447090Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:17:58.7460246Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42006 2022-09-27T16:17:58.7466511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42007 2022-09-27T16:18:00.4191798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:00.4192503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:00.4193274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:00.4193982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:00.4303176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:00.4303944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:00.4306441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:00.4307236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:00.6647086Z dist init r=1, world=2 2022-09-27T16:18:00.6647443Z dist init r=0, world=2 2022-09-27T16:18:00.6657768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:00.6660076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:00.6661130Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:00.6761464Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:02.0700954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:02.0701512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:02.1533011Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:02.1535461Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:02.1538213Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:02.1540610Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T16:18:02.5542010Z ok (3.811s) 2022-09-27T16:18:02.5559897Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:18:02.5572672Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42087 2022-09-27T16:18:02.5579018Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42088 2022-09-27T16:18:04.2240810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:04.2241308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:04.2242193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:04.2242669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:04.2388819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:04.2392264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:04.2392853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:04.2393326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:04.4676994Z dist init r=0, world=2 2022-09-27T16:18:04.4677688Z dist init r=1, world=2 2022-09-27T16:18:04.4686981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:04.4688053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:04.4689282Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:04.4690744Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:05.8658632Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:05.8659477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:05.9505746Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:05.9508272Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:06.3658727Z ok (3.812s) 2022-09-27T16:18:06.3677240Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:18:06.3690668Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42168 2022-09-27T16:18:06.3696764Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42169 2022-09-27T16:18:08.0054392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:08.0054945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:08.0055542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:08.0056008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:08.0293652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:08.0294098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:08.0296972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:08.0297600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:08.2466128Z dist init r=0, world=2 2022-09-27T16:18:08.2476453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:08.2484740Z dist init r=1, world=2 2022-09-27T16:18:08.2496049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:08.2496837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:08.2579470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:09.6342825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:09.6343367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:09.7157540Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:09.7158845Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:09.7160070Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:09.7161305Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T16:18:10.1786474Z ok (3.813s) 2022-09-27T16:18:10.1804764Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=False)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:18:10.1818658Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42249 2022-09-27T16:18:10.1825446Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42250 2022-09-27T16:18:11.7854631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:11.7855141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:11.7856886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:11.7857589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:11.8191914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:11.8192374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:11.8196269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:11.8196749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:12.0222641Z dist init r=1, world=2 2022-09-27T16:18:12.0232977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:12.0449715Z dist init r=0, world=2 2022-09-27T16:18:12.0462149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:12.0462963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:12.0538252Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:13.4627892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:13.4628415Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:13.9903110Z ok (3.812s) 2022-09-27T16:18:13.9921374Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:18:13.9934904Z Tests that we can save a state_dict and load it into a blank model ... 
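Editor's note: the state_dict_rank0_and_offload_True variants correspond to FSDP's FullStateDictConfig options. A sketch of how those options are typically used, assuming an initialized process group (rank0_cpu_state_dict is an illustrative name):

```python
# With FullStateDictConfig(offload_to_cpu=True, rank0_only=True), only
# rank 0 materializes the full state_dict, and it is returned on CPU.
import torch.distributed as dist
from torch.distributed.fsdp import (
    FullStateDictConfig,
    FullyShardedDataParallel as FSDP,
    StateDictType,
)

def rank0_cpu_state_dict(model: FSDP):
    cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
    with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, cfg):
        state = model.state_dict()
    # Per the FSDP docs, nonzero ranks save an empty dict under rank0_only.
    return state if dist.get_rank() == 0 else None
```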
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42330 2022-09-27T16:18:13.9941246Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42331 2022-09-27T16:18:15.6335649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:15.6336552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:15.6337173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:15.6337649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:15.6479303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:15.6479773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:15.6483097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:15.6483577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:15.8746584Z dist init r=0, world=2 2022-09-27T16:18:15.8757223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:15.8778350Z dist init r=1, world=2 2022-09-27T16:18:15.8790747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:15.8791791Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:15.8860055Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:17.2835624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:17.2836184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:17.3728883Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:17.3730187Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:17.8019188Z ok (3.812s) 2022-09-27T16:18:17.8037469Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:18:17.8050582Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42411 2022-09-27T16:18:17.8056944Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42412 2022-09-27T16:18:19.4219122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:19.4219680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:19.4220278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:19.4220749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:19.4447032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:19.4447483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:19.4451148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:19.4451650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:19.6629033Z dist init r=1, world=2 2022-09-27T16:18:19.6639993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:19.6731614Z dist init r=0, world=2 2022-09-27T16:18:19.6744933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:19.6745696Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:19.6845254Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:21.0632175Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:21.0632702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:21.1619673Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:21.1622947Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:21.6134062Z ok (3.811s) 2022-09-27T16:18:21.6151937Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:18:21.6165620Z Tests that we can save a state_dict and load it into a blank model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42492 2022-09-27T16:18:21.6172091Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42493 2022-09-27T16:18:23.2701319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:23.2701828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:23.2702752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:23.2703226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:23.2721677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:23.2722131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:23.2724985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:23.2725637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:23.4889600Z dist init r=0, world=2 2022-09-27T16:18:23.4900074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:23.5047354Z dist init r=1, world=2 2022-09-27T16:18:23.5058291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:23.5059138Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:23.5103874Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:24.8720174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:24.8720721Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:24.9626475Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:24.9627751Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:24.9628955Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:18:24.9630191Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-09-27T16:18:25.4262355Z ok (3.813s) 2022-09-27T16:18:25.4280332Z test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload_CPUOffload(offload_params=True)_fp16_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:18:25.4293829Z Tests that we can save a state_dict and load it into a blank model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42573 2022-09-27T16:18:25.4300499Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42574 2022-09-27T16:18:27.0828895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:27.0829877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:27.0831730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:27.0832715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:27.0835286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:27.0836192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:27.0838553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:27.0839063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:27.3051520Z dist init r=1, world=2 2022-09-27T16:18:27.3061660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:27.3205395Z dist init r=0, world=2 2022-09-27T16:18:27.3218444Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:27.3219934Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:27.3266420Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:28.6882970Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:28.6883924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:29.2389021Z ok (3.813s) 2022-09-27T16:18:29.2409493Z test_fsdp_state_dict_keys_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... 
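Editor's note: the test_fsdp_state_dict_keys_* cases presumably compare key names across StateDictType values. A hedged sketch of inspecting those keys (state_dict_keys is an illustrative helper, and the comment on key shapes is an assumption):

```python
# Inspect state_dict key namespaces under each StateDictType.
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

def state_dict_keys(model: FSDP, sd_type: StateDictType):
    # The context manager switches which state_dict implementation runs.
    with FSDP.state_dict_type(model, sd_type):
        return list(model.state_dict().keys())

# LOCAL_STATE_DICT keys refer to flattened parameters (e.g. "flat_param"),
# while FULL_STATE_DICT keys match the unwrapped module's parameter names.
```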
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42654 2022-09-27T16:18:29.2415926Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42655 2022-09-27T16:18:30.9108893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:30.9109623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:30.9111584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:30.9112467Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:30.9226073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:30.9226523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:30.9229298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:30.9229789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:31.1456036Z dist init r=1, world=2 2022-09-27T16:18:31.1466708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:31.1574265Z dist init r=0, world=2 2022-09-27T16:18:31.1586112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:31.1587014Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:31.1670740Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:32.5563882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:32.5564403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:32.5777258Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:18:32.5778073Z warnings.warn( 2022-09-27T16:18:32.5830193Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:18:32.5831292Z warnings.warn( 2022-09-27T16:18:33.0492242Z ok (3.810s) 2022-09-27T16:18:33.0513363Z test_fsdp_state_dict_keys_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42735 2022-09-27T16:18:33.0520863Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42736 2022-09-27T16:18:34.6813109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:34.6813627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:34.6814723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:34.6815183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:34.7093210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:34.7093773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:34.7096375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:34.7096873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:34.9193963Z dist init r=1, world=2 2022-09-27T16:18:34.9204665Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:34.9367334Z dist init r=0, world=2 2022-09-27T16:18:34.9379329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:34.9380115Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:34.9408708Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:36.3336124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:36.3336641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:36.3534618Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:18:36.3535407Z warnings.warn( 2022-09-27T16:18:36.3536516Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:18:36.3537274Z warnings.warn( 2022-09-27T16:18:36.7612348Z ok (3.712s) 2022-09-27T16:18:36.7632816Z test_fsdp_state_dict_keys_state_dict_type_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42816 2022-09-27T16:18:36.7639273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42817 2022-09-27T16:18:38.3644923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:38.3645418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:38.3646621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:38.3647100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:38.4002732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:38.4003185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:38.4006146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:38.4006658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:38.5989238Z dist init r=1, world=2 2022-09-27T16:18:38.5999791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:38.6184126Z dist init r=0, world=2 2022-09-27T16:18:38.6195683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:38.6196480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:38.6203717Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:39.9865379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:39.9865892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:40.0094317Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:18:40.0095196Z warnings.warn( 2022-09-27T16:18:40.0096316Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:18:40.0097075Z warnings.warn( 2022-09-27T16:18:40.4714728Z ok (3.710s) 2022-09-27T16:18:40.4723389Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both (__main__.TestFSDPStateDict) 2022-09-27T16:18:40.4737233Z Tests saving the state dict, zeroing a target model's parameters, and ... 
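Editor's note: the UserWarning repeated in the three tests above recommends passing device_id so that FSDP moves a CPU-constructed module to the GPU before flattening and sharding. A minimal sketch of that fix, assuming CUDA and an initialized process group:

```python
# Addressing the warning logged above: device_id tells FSDP which GPU
# to move the module to, which also satisfies sync_module_states=True's
# requirement for GPU communication.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

model = FSDP(
    nn.Linear(8, 8),                         # constructed on CPU
    device_id=torch.cuda.current_device(),   # FSDP moves it to this GPU
    sync_module_states=True,
)
```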
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42897 2022-09-27T16:18:40.4743389Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42898 2022-09-27T16:18:42.1131764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:42.1132290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:42.1133968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:42.1134465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:42.1231072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:42.1231791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:42.1234891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:42.1235362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:42.3458878Z dist init r=0, world=2 2022-09-27T16:18:42.3469242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:42.3508070Z dist init r=1, world=2 2022-09-27T16:18:42.3519686Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:42.3520805Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:42.3571985Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:43.7392549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:43.7393060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:44.2821594Z ok (3.811s) 2022-09-27T16:18:44.2832486Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_first (__main__.TestFSDPStateDict) 2022-09-27T16:18:44.2849819Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42978 2022-09-27T16:18:44.2857478Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42979 2022-09-27T16:18:45.9072275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:45.9072796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:45.9073932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:45.9074391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:45.9296340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:45.9296800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:45.9299818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:45.9300302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:46.1476280Z dist init r=0, world=2 2022-09-27T16:18:46.1486418Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:46.1569927Z dist init r=1, world=2 2022-09-27T16:18:46.1581952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:46.1582805Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:46.1588888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:47.5649262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:47.5649791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:48.0936789Z ok (3.811s) 2022-09-27T16:18:48.0945400Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_second (__main__.TestFSDPStateDict) 2022-09-27T16:18:48.0960970Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43059 2022-09-27T16:18:48.0967708Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43060 2022-09-27T16:18:49.6993967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:49.6994513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:49.6996275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:49.6996750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:49.7371368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:49.7371833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:49.7374989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:49.7375454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:49.9331127Z dist init r=1, world=2 2022-09-27T16:18:49.9341325Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:49.9614840Z dist init r=0, world=2 2022-09-27T16:18:49.9626999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:49.9628102Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:49.9646636Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:51.3760682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:51.3761211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:51.9045253Z ok (3.811s) 2022-09-27T16:18:51.9053003Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_both (__main__.TestFSDPStateDict) 2022-09-27T16:18:51.9068634Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43140 2022-09-27T16:18:51.9075212Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43141 2022-09-27T16:18:53.5198981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:53.5199485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:53.5200483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:53.5200969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:53.5355206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:53.5355654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:53.5358623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:53.5359105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:53.7616229Z dist init r=1, world=2 2022-09-27T16:18:53.7627049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:53.7679702Z dist init r=0, world=2 2022-09-27T16:18:53.7691558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:53.7692922Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:53.7729916Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:55.1797870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:55.1798384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:55.7150835Z ok (3.810s) 2022-09-27T16:18:55.7159524Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_first (__main__.TestFSDPStateDict) 2022-09-27T16:18:55.7173744Z Tests saving the state dict, zeroing a target model's parameters, and ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43221 2022-09-27T16:18:55.7180164Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43222 2022-09-27T16:18:57.3670042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:57.3670545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:57.3671927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:57.3672420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:57.4067418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:18:57.4067873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:18:57.4070915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:18:57.4072026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:18:57.6003703Z dist init r=0, world=2 2022-09-27T16:18:57.6013955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:18:57.6284764Z dist init r=1, world=2 2022-09-27T16:18:57.6296705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:18:57.6297478Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:57.6319489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:18:59.0215603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:18:59.0216122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:18:59.5256262Z ok (3.810s) 2022-09-27T16:18:59.5264217Z test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_second (__main__.TestFSDPStateDict) 2022-09-27T16:18:59.5278705Z Tests saving the state dict, zeroing a target model's parameters, and ... 
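Editor's note: the checkpoint_wrap_first/second/both variants suggest wrapping one or both submodules in an activation-checkpoint wrapper before applying FSDP. A sketch under that assumption; checkpoint_wrapper lives in an internal module at this PyTorch version, so treat this as illustrative rather than the test's actual construction:

```python
# Build a two-layer model with optional activation checkpointing on
# either submodule, then shard the whole thing with FSDP.
import torch.nn as nn
from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
    checkpoint_wrapper,
)
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def build_model(wrap_first: bool, wrap_second: bool) -> FSDP:
    first, second = nn.Linear(8, 8), nn.Linear(8, 8)
    if wrap_first:
        first = checkpoint_wrapper(first)    # "checkpoint_wrap_first"
    if wrap_second:
        second = checkpoint_wrapper(second)  # "checkpoint_wrap_second"
    return FSDP(nn.Sequential(first, second).cuda())
```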
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43302 2022-09-27T16:18:59.5284514Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43303 2022-09-27T16:19:01.1404801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:01.1405836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:01.1407039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:01.1408309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:01.1630295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:01.1631380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:01.1634098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:01.1635087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:01.3863263Z dist init r=1, world=2 2022-09-27T16:19:01.3874087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:01.3925953Z dist init r=0, world=2 2022-09-27T16:19:01.3938037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:01.3938841Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:01.3977516Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:02.7900426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:02.7900976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:03.3359758Z ok (3.810s) 2022-09-27T16:19:03.3379407Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:03.3392746Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43383 2022-09-27T16:19:03.3399434Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43384 2022-09-27T16:19:04.9705950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:04.9706630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:04.9707601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:04.9708091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:04.9767727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:04.9768193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:04.9770770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:04.9771251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:05.2052829Z dist init r=1, world=2 2022-09-27T16:19:05.2064032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:05.2098338Z dist init r=0, world=2 2022-09-27T16:19:05.2109659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:05.2110637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:05.2166593Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:06.6024945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:06.6025458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:07.5496458Z ok (4.214s) 2022-09-27T16:19:07.5521360Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:07.5536813Z Test that saving after some training results in params being updated as ... 
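[Editor's note] The long parametrized names (`…state_dict_type_local_state_dict…`, `…sharded_state_dict…`, plain `…state_dict…`) map onto FSDP's three state-dict flavors. A sketch of how that knob is selected via the `FSDP.state_dict_type` context manager; the helper function is illustrative, not from the test file:

```python
# Sketch of the state-dict flavor selection the parametrized names refer to.
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import StateDictType

def get_state_dict(model: FSDP, flavor: str):
    mapping = {
        "state_dict": StateDictType.FULL_STATE_DICT,             # unsharded
        "local_state_dict": StateDictType.LOCAL_STATE_DICT,      # flat local shard
        "sharded_state_dict": StateDictType.SHARDED_STATE_DICT,  # ShardedTensor entries
    }
    with FSDP.state_dict_type(model, mapping[flavor]):
        return model.state_dict()
```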
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43468 2022-09-27T16:19:07.5544101Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43469 2022-09-27T16:19:09.2028097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:09.2028636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:09.2029432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:09.2029914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:09.2576044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:09.2576632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:09.2578016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:09.2578691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:09.4378399Z dist init r=1, world=2 2022-09-27T16:19:09.4388484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:09.4780391Z dist init r=0, world=2 2022-09-27T16:19:09.4792423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:09.4793216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:09.4794979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:10.8933245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:10.8933787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:11.3638134Z ok (3.814s) 2022-09-27T16:19:11.3657927Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:11.3672306Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43549 2022-09-27T16:19:11.3679481Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43550 2022-09-27T16:19:12.9935805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:12.9936315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:12.9936907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:12.9937371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:13.0280061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:13.0280542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:13.0283818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:13.0284282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:13.2296751Z dist init r=0, world=2 2022-09-27T16:19:13.2307531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:13.2561136Z dist init r=1, world=2 2022-09-27T16:19:13.2573072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:13.2573861Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:13.2612954Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:14.6431188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:14.6431752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:15.6782877Z ok (4.314s) 2022-09-27T16:19:15.6803208Z test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:15.6818027Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43634 2022-09-27T16:19:15.6824970Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43635 2022-09-27T16:19:17.2988858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:17.2989688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:17.2990312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:17.2990967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:17.3334220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:17.3334676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:17.3337721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:17.3338183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:17.5340954Z dist init r=1, world=2 2022-09-27T16:19:17.5352263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:17.5597428Z dist init r=0, world=2 2022-09-27T16:19:17.5609897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:17.5610683Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:17.5657407Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:18.9756740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:18.9757624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:19.3916307Z ok (3.713s) 2022-09-27T16:19:19.3936038Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:19.3949832Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43715 2022-09-27T16:19:19.3956429Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43716 2022-09-27T16:19:21.0424570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:21.0425247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:21.0426438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:21.0426932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:21.0588831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:21.0589278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:21.0592400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:21.0592904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:21.2860533Z dist init r=1, world=2 2022-09-27T16:19:21.2870709Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:21.2892937Z dist init r=0, world=2 2022-09-27T16:19:21.2904959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:21.2905768Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:21.2973427Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:22.6841535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:22.6842341Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:23.6057344Z ok (4.214s) 2022-09-27T16:19:23.6077464Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:23.6091017Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43800 2022-09-27T16:19:23.6097771Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43801 2022-09-27T16:19:25.2878628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:25.2879142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:25.2879936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:25.2880436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:25.3353393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:25.3353859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:25.3357164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:25.3357641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:25.5194531Z dist init r=0, world=2 2022-09-27T16:19:25.5204955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:25.5591804Z dist init r=1, world=2 2022-09-27T16:19:25.5604005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:25.5604773Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:25.5611076Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:26.9495229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:26.9495788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:27.4174808Z ok (3.812s) 2022-09-27T16:19:27.4194778Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:27.4208431Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43881 2022-09-27T16:19:27.4214562Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43882 2022-09-27T16:19:29.0732958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:29.0733880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:29.0735075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:29.0735626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:29.0800007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:29.0800471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:29.0803244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:29.0803761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:29.3093370Z dist init r=1, world=2 2022-09-27T16:19:29.3105361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:29.3126077Z dist init r=0, world=2 2022-09-27T16:19:29.3138174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:29.3138961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:29.3208112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:30.7054289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:30.7054847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:31.7302299Z ok (4.313s) 2022-09-27T16:19:31.7322917Z test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:31.7336806Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43966 2022-09-27T16:19:31.7343115Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43967 2022-09-27T16:19:33.4368448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:33.4369217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:33.4370578Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:33.4371054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:33.4547144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:33.4547647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:33.4550518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:33.4551251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:33.6807586Z dist init r=0, world=2 2022-09-27T16:19:33.6818455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:33.6884774Z dist init r=1, world=2 2022-09-27T16:19:33.6896792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:33.6898000Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:33.6921025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:35.0616375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:35.0616955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:35.5423039Z ok (3.812s) 2022-09-27T16:19:35.5443539Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:35.5457688Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44047 2022-09-27T16:19:35.5464728Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44048 2022-09-27T16:19:37.2109010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:37.2109502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:37.2110301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:37.2111348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:37.2570192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:37.2570668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:37.2573064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:37.2573555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:37.4415074Z dist init r=1, world=2 2022-09-27T16:19:37.4424797Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:37.4774782Z dist init r=0, world=2 2022-09-27T16:19:37.4786744Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:37.4787538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:37.4831345Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:38.8729980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:38.8730521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:39.8550272Z ok (4.313s) 2022-09-27T16:19:39.8570862Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:39.8584587Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44132 2022-09-27T16:19:39.8591008Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44133 2022-09-27T16:19:41.4933712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:41.4934248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:41.4935091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:41.4935567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:41.5209748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:41.5210210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:41.5213044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:41.5213518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:41.7351512Z dist init r=0, world=2 2022-09-27T16:19:41.7362399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:41.7475280Z dist init r=1, world=2 2022-09-27T16:19:41.7487002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:41.7488234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:41.7566317Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:43.1268201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:43.1268766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:44.0676367Z ok (4.213s) 2022-09-27T16:19:44.0695871Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:44.0709335Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44217 2022-09-27T16:19:44.0715930Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44218 2022-09-27T16:19:45.6944034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:45.6944574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:45.6945649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:45.6946142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:45.7214518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:45.7215005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:45.7218366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:45.7218829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:45.9321421Z dist init r=0, world=2 2022-09-27T16:19:45.9331397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:45.9463584Z dist init r=1, world=2 2022-09-27T16:19:45.9475586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:45.9476828Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:45.9535074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:47.3332826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:47.3333385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:48.2799566Z ok (4.212s) 2022-09-27T16:19:48.2819471Z test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:48.2833822Z Test that saving after some training results in params being updated as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44302 2022-09-27T16:19:48.2840273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44303 2022-09-27T16:19:49.9133200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:49.9133693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:49.9134531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:49.9135007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:49.9400751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:49.9401230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:49.9403620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:49.9404104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:50.1547581Z dist init r=1, world=2 2022-09-27T16:19:50.1558109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:50.1652425Z dist init r=0, world=2 2022-09-27T16:19:50.1664446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:50.1665426Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:50.1764610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:51.5494791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:51.5495320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:52.4924345Z ok (4.212s) 2022-09-27T16:19:52.4946530Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False (__main__.TestFSDPStateDict) 2022-09-27T16:19:52.4961155Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44387 2022-09-27T16:19:52.4967722Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44388 2022-09-27T16:19:54.1628801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:54.1629351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:54.1630180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:54.1630661Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:54.1900521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:54.1900992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:54.1904747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:54.1905233Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:54.4006542Z dist init r=0, world=2 2022-09-27T16:19:54.4016561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:54.4164746Z dist init r=1, world=2 2022-09-27T16:19:54.4177991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:54.4179507Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:54.4221369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:55.8083646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:19:55.8084655Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:19:56.8051525Z ok (4.313s) 2022-09-27T16:19:56.8080094Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True (__main__.TestFSDPStateDict) 2022-09-27T16:19:56.8093738Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44472 2022-09-27T16:19:56.8099724Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44473 2022-09-27T16:19:58.4443516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:58.4444055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:58.4445025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:58.4445505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:58.4572141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:19:58.4572856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:19:58.4575322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:19:58.4575799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:19:58.6808135Z dist init r=1, world=2 2022-09-27T16:19:58.6818032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:19:58.6850488Z dist init r=0, world=2 2022-09-27T16:19:58.6862033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:19:58.6864088Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:19:58.6920828Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:00.0873431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:00.0873982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:00.1137333Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:00.1138903Z warnings.warn( 2022-09-27T16:20:00.1141265Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:00.1142745Z warnings.warn( 2022-09-27T16:20:01.0182535Z ok (4.213s) 2022-09-27T16:20:01.0205222Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False (__main__.TestFSDPStateDict) 2022-09-27T16:20:01.0218890Z Tests that FSDP's state_dict can be loaded into a local model. ... 
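[Editor's note] The `UserWarning` from `fully_sharded_data_parallel.py:1414` above fires because the module being wrapped still lives on CPU, so flattening and sharding run there. A minimal sketch of the fix the warning itself recommends, assuming an initialized process group and an illustrative module:

```python
# Sketch of silencing the CPU warning by letting FSDP move the module to
# the current GPU. Assumes dist.init_process_group(...) has already run.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

model = torch.nn.Linear(8, 8)  # constructed on CPU, as in the warning
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),  # FSDP moves module to GPU first
    sync_module_states=True,                # requires params on GPU to broadcast
)
```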
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44557 2022-09-27T16:20:01.0224970Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44558 2022-09-27T16:20:02.7090086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:02.7090560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:02.7092886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:02.7093575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:02.7299903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:02.7300352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:02.7303394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:02.7303870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:02.9556792Z dist init r=1, world=2 2022-09-27T16:20:02.9567164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:02.9655457Z dist init r=0, world=2 2022-09-27T16:20:02.9667540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:02.9668538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:02.9669673Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:04.3656866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:04.3657368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:04.8303478Z ok (3.812s) 2022-09-27T16:20:04.8326122Z test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True (__main__.TestFSDPStateDict) 2022-09-27T16:20:04.8340179Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44638 2022-09-27T16:20:04.8346063Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44639 2022-09-27T16:20:06.4139789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:06.4140272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:06.4141276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:06.4141795Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:06.4936677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:06.4937170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:06.4938427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:06.4938884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:06.6260943Z dist init r=0, world=2 2022-09-27T16:20:06.6271093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:06.7146150Z dist init r=1, world=2 2022-09-27T16:20:06.7158783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:06.7159572Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:06.7183636Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:08.1033958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:08.1034489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:08.5434337Z ok (3.713s) 2022-09-27T16:20:08.5456160Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_False (__main__.TestFSDPStateDict) 2022-09-27T16:20:08.5469901Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44719 2022-09-27T16:20:08.5476024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44720 2022-09-27T16:20:10.2107644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:10.2108142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:10.2109456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:10.2109921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:10.2574699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:10.2575425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:10.2577204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:10.2577666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:10.4441412Z dist init r=0, world=2 2022-09-27T16:20:10.4451240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:10.4832969Z dist init r=1, world=2 2022-09-27T16:20:10.4845275Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:10.4846385Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:10.4857877Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:11.8833545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:11.8834096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:12.8567948Z ok (4.313s) 2022-09-27T16:20:12.8590200Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True (__main__.TestFSDPStateDict) 2022-09-27T16:20:12.8605672Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44804 2022-09-27T16:20:12.8612200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44805 2022-09-27T16:20:14.5037878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:14.5038369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:14.5039224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:14.5039715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:14.5327975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:14.5328422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:14.5331644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:14.5332163Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:14.7459987Z dist init r=1, world=2 2022-09-27T16:20:14.7470197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:14.7618859Z dist init r=0, world=2 2022-09-27T16:20:14.7630855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:14.7632189Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:14.7674860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:16.1877054Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:16.2095649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:16.2097058Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:16.2098084Z warnings.warn( 2022-09-27T16:20:16.2099203Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:16.2099932Z warnings.warn( 2022-09-27T16:20:17.1701647Z ok (4.313s) 2022-09-27T16:20:17.1723800Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_False (__main__.TestFSDPStateDict) 2022-09-27T16:20:17.1738130Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44889 2022-09-27T16:20:17.1744246Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44890 2022-09-27T16:20:18.8480583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:18.8481085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:18.8482063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:18.8482570Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:18.8690313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:18.8690768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:18.8693501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:18.8693972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:19.0935162Z dist init r=1, world=2 2022-09-27T16:20:19.0945483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:19.0948661Z dist init r=0, world=2 2022-09-27T16:20:19.0960956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:19.0961728Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:19.1048595Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:20.5054245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:20.5054780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:21.4831837Z ok (4.313s) 2022-09-27T16:20:21.4854771Z test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True (__main__.TestFSDPStateDict) 2022-09-27T16:20:21.4869550Z Tests that FSDP's state_dict can be loaded into a local model. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44974 2022-09-27T16:20:21.4875994Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44975 2022-09-27T16:20:23.1468830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:23.1469352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:23.1470357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:23.1471125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:23.1517479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:23.1518197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:23.1520936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:23.1521411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:23.3772762Z dist init r=0, world=2 2022-09-27T16:20:23.3782219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:23.3866578Z dist init r=1, world=2 2022-09-27T16:20:23.3878191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:23.3878980Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:23.3884751Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:24.7902009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:24.7902560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:24.8096298Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:24.8097078Z warnings.warn( 2022-09-27T16:20:24.8098157Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:24.8098922Z warnings.warn( 2022-09-27T16:20:25.7961184Z ok (4.313s) 2022-09-27T16:20:25.7976700Z test_state_dict_rank0_offload_save_load_flow (__main__.TestFSDPStateDict) 2022-09-27T16:20:25.7990528Z Tests saving a model checkpoint only on rank 0 and loading it only ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45059 2022-09-27T16:20:25.7997420Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45060 2022-09-27T16:20:27.4272887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:27.4273866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:27.4275065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:27.4275978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:27.4346256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:27.4346774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:27.4349097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:27.4349569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:27.6539160Z dist init r=1, world=2 2022-09-27T16:20:27.6549588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:27.6616708Z dist init r=0, world=2 2022-09-27T16:20:27.6627340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:27.6628392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:27.6651998Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:29.0323730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:29.0324262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:29.7076379Z ok (3.911s) 2022-09-27T16:20:29.7094132Z test_state_dict_save_load_flow_state_dict_type_local_state_dict (__main__.TestFSDPStateDict) ... 
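[Editor's note] `test_state_dict_rank0_offload_save_load_flow` above exercises gathering the full state dict only on rank 0, offloaded to CPU, before saving. A hedged sketch of that flow; the function name and checkpoint path are illustrative:

```python
# Sketch of the rank0-and-offload save flow the test name describes.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import StateDictType, FullStateDictConfig

def save_rank0_full_checkpoint(fsdp_model: FSDP, path: str) -> None:
    cfg = FullStateDictConfig(rank0_only=True, offload_to_cpu=True)
    with FSDP.state_dict_type(fsdp_model, StateDictType.FULL_STATE_DICT, cfg):
        state = fsdp_model.state_dict()  # full CPU tensors only on rank 0
    if dist.get_rank() == 0:
        torch.save(state, path)          # e.g. save_rank0_full_checkpoint(m, "ckpt.pt")
```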
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45140 2022-09-27T16:20:29.7100457Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45141 2022-09-27T16:20:31.3190411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:31.3191794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:31.3193002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:31.3193926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:31.3480991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:20:31.3481939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:20:31.3484068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:20:31.3485020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:20:31.5566436Z dist init r=0, world=2 2022-09-27T16:20:31.5578048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:20:31.5724935Z dist init r=1, world=2 2022-09-27T16:20:31.5737022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:20:31.5737848Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:31.5781770Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:20:32.9554059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:20:32.9554580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:20:32.9774318Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:32.9775115Z warnings.warn( 2022-09-27T16:20:32.9776502Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:20:32.9777272Z warnings.warn( 2022-09-27T16:20:33.4210545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:20:33.4211100Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:20:33.9185374Z ok (4.211s) 2022-09-27T16:20:33.9202795Z test_state_dict_save_load_flow_state_dict_type_sharded_state_dict (__main__.TestFSDPStateDict) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45225
2022-09-27T16:20:33.9209025Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45226
2022-09-27T16:20:35.5480053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:35.5480543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:35.5481849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:35.5482333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:35.5791863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:35.5792328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:35.5795613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:35.5796110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:35.7865068Z dist init r=0, world=2
2022-09-27T16:20:35.7875822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:20:35.8075152Z dist init r=1, world=2
2022-09-27T16:20:35.8087358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:20:35.8088416Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:35.8180829Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:37.1871184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:20:37.1871740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:20:37.2096386Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:20:37.2097170Z warnings.warn(
2022-09-27T16:20:37.2098260Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:20:37.2099012Z warnings.warn(
2022-09-27T16:20:37.6484362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:20:37.6484963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:20:38.1290754Z ok (4.210s)
2022-09-27T16:20:38.1308603Z test_state_dict_save_load_flow_state_dict_type_state_dict (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45310
2022-09-27T16:20:38.1315363Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45311
2022-09-27T16:20:39.7775303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:39.7775823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:39.7776631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:39.7777352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:39.7991747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:39.7992213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:39.7995696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:39.7996159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:40.0194582Z dist init r=0, world=2
2022-09-27T16:20:40.0205171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:20:40.0291355Z dist init r=1, world=2
2022-09-27T16:20:40.0303666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:20:40.0305013Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:40.0307858Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:41.4288323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:20:41.4288861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:20:41.4495315Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:20:41.4496128Z warnings.warn(
2022-09-27T16:20:41.4530072Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:20:41.4530843Z warnings.warn(
2022-09-27T16:20:41.8935362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:20:41.8936130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:20:42.3398382Z ok (4.211s)
2022-09-27T16:20:42.3434578Z test_state_dict_skip_module_state_dict_type_local_state_dict_double_nest_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45395
2022-09-27T16:20:42.3440763Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45396
2022-09-27T16:20:44.0035462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:44.0036461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:44.0037049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:44.0037525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:44.0386997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:44.0387459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:44.0390815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:44.0391713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:44.2369826Z dist init r=1, world=2
2022-09-27T16:20:44.2380302Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:20:44.2664168Z dist init r=0, world=2
2022-09-27T16:20:44.2677725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:20:44.2678868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:44.2685240Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:45.6542597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:20:45.6543118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:20:46.5528755Z ok (4.213s)
2022-09-27T16:20:46.5565233Z test_state_dict_skip_module_state_dict_type_sharded_state_dict_double_nest_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45480
2022-09-27T16:20:46.5571144Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45481
2022-09-27T16:20:48.1719143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:48.1719643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:48.1720432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:48.1720909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:48.2157283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:48.2157767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:48.2161193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:48.2161682Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:48.4003842Z dist init r=0, world=2
2022-09-27T16:20:48.4013840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:20:48.4418082Z dist init r=1, world=2
2022-09-27T16:20:48.4430624Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:20:48.4432172Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:48.4521508Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:49.8257079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:20:49.8257620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:20:50.7652127Z ok (4.212s)
2022-09-27T16:20:50.7688624Z test_state_dict_skip_module_state_dict_type_state_dict_double_nest_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45565
2022-09-27T16:20:50.7694720Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45566
2022-09-27T16:20:52.4358469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:52.4359433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:52.4360621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:52.4361507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:52.4872209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:52.4873078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:52.4874972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:52.4875794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:52.6660521Z dist init r=0, world=2
2022-09-27T16:20:52.6671654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:20:52.7117519Z dist init r=1, world=2
2022-09-27T16:20:52.7129755Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:20:52.7130858Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:52.7178807Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:54.1005893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:20:54.1006883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:20:55.0779218Z ok (4.313s)
2022-09-27T16:20:55.0796994Z test_state_dict_type (__main__.TestFSDPStateDict) ...
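test_state_dict_type, announced just above, exercises the switch between FSDP's state-dict flavors; the _local_state_dict/_sharded_state_dict/_state_dict suffixes on the save/load-flow tests earlier in the run map to the same enum. A minimal sketch of that public API, assuming `model` is already FSDP-wrapped on an initialized process group:

    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

    # Each context changes what model.state_dict() returns on every rank.
    with FSDP.state_dict_type(model, StateDictType.LOCAL_STATE_DICT):
        local_sd = model.state_dict()    # this rank's flat shard only
    with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
        sharded_sd = model.state_dict()  # ShardedTensor entries per parameter
    with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT):
        full_sd = model.state_dict()     # unsharded, gathered parameters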
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45650
2022-09-27T16:20:55.0804138Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45651
2022-09-27T16:20:56.7761363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:56.7762156Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:56.7762776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:56.7763251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:56.7930777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:20:56.7931245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:20:56.7934332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:20:56.7934799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:20:57.0273418Z dist init r=0, world=2
2022-09-27T16:20:57.0282970Z dist init r=1, world=2
2022-09-27T16:20:57.0283857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:20:57.0295468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:20:57.0296290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:57.0387447Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:20:58.4503072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:20:58.4503595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:20:58.8879253Z ok (3.810s)
2022-09-27T16:20:58.8913144Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_False (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45731
2022-09-27T16:20:58.8919535Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45732
2022-09-27T16:21:00.5440001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:00.5441074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:00.5441942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:00.5442429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:00.5614229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:00.5614690Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:00.5617925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:00.5618410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:00.7856458Z dist init r=1, world=2
2022-09-27T16:21:00.7866687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:00.7956995Z dist init r=0, world=2
2022-09-27T16:21:00.7969284Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:00.7970219Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:00.7970913Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:02.1902991Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:02.1903525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:02.2096573Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:02.2097379Z warnings.warn(
2022-09-27T16:21:02.2131351Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:02.2132094Z warnings.warn(
2022-09-27T16:21:02.6995554Z ok (3.812s)
2022-09-27T16:21:02.7030677Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45812
2022-09-27T16:21:02.7036941Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45813
2022-09-27T16:21:04.3082615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:04.3083494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:04.3084327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:04.3084808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:04.3354386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:04.3354832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:04.3357720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:04.3358382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:04.5445715Z dist init r=1, world=2
2022-09-27T16:21:04.5456434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:04.5615261Z dist init r=0, world=2
2022-09-27T16:21:04.5627334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:04.5628512Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:04.5661467Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:05.9464682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:05.9465213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:05.9656037Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:05.9656690Z warnings.warn(
2022-09-27T16:21:05.9658075Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:05.9658742Z warnings.warn(
2022-09-27T16:21:06.4111118Z ok (3.711s)
2022-09-27T16:21:06.4145656Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_False (__main__.TestFSDPStateDict) ...
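The fully_sharded_data_parallel.py:1170 warnings above come from the ignore_inner_True variants deliberately listing the wrapped module itself in ignored_modules, which leaves FSDP with no parameters to shard. The supported shape of the feature is ignoring a proper submodule; a minimal sketch (process group assumed initialized, module sizes illustrative):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    inner = torch.nn.Linear(4, 4)
    model = torch.nn.Sequential(inner, torch.nn.Linear(4, 4))
    # inner's parameters stay ordinary and are kept out of FSDP's flat parameter;
    # passing `model` itself here instead would trigger the warning seen above.
    wrapped = FSDP(model, ignored_modules=[inner])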
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45893
2022-09-27T16:21:06.4151777Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45894
2022-09-27T16:21:08.0215561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:08.0216338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:08.0216953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:08.0217686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:08.0568954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:08.0569431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:08.0572375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:08.0572885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:08.2605371Z dist init r=1, world=2
2022-09-27T16:21:08.2616097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:08.2825533Z dist init r=0, world=2
2022-09-27T16:21:08.2837863Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:08.2838654Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:08.2921356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:09.6654889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:09.6655432Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:09.6856096Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:09.6857192Z warnings.warn(
2022-09-27T16:21:09.6858307Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:09.6859053Z warnings.warn(
2022-09-27T16:21:10.1225892Z ok (3.711s)
2022-09-27T16:21:10.1260584Z test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45974
2022-09-27T16:21:10.1266688Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45975
2022-09-27T16:21:11.7408522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:11.7409221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:11.7410594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:11.7411078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:11.7636977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:11.7637462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:11.7640283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:11.7640767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:11.9829969Z dist init r=1, world=2
2022-09-27T16:21:11.9841175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:11.9908845Z dist init r=0, world=2
2022-09-27T16:21:11.9920893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:11.9921695Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:11.9942929Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:13.4010060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:13.4010605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:13.4214675Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:13.4215347Z warnings.warn(
2022-09-27T16:21:13.4216239Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:13.4216842Z warnings.warn(
2022-09-27T16:21:13.8343643Z ok (3.712s)
2022-09-27T16:21:13.8378264Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_False (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46055
2022-09-27T16:21:13.8384570Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46056
2022-09-27T16:21:15.4453793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:15.4454329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:15.4455470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:15.4455953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:15.4883039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:15.4883509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:15.4886472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:15.4886966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:15.6821346Z dist init r=1, world=2
2022-09-27T16:21:15.6832084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:15.7138422Z dist init r=0, world=2
2022-09-27T16:21:15.7149970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:15.7151170Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:15.7238465Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:17.1104573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:17.1105127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:17.1296129Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:17.1296911Z warnings.warn(
2022-09-27T16:21:17.1298056Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:17.1298807Z warnings.warn(
2022-09-27T16:21:17.5458660Z ok (3.711s)
2022-09-27T16:21:17.5493273Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46136
2022-09-27T16:21:17.5499288Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46137
2022-09-27T16:21:19.1803289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:19.1803789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:19.1804783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:19.1805296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:19.1838248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:19.1838708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:19.1841609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:19.1842088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:19.4104706Z dist init r=1, world=2
2022-09-27T16:21:19.4115600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:19.4135862Z dist init r=0, world=2
2022-09-27T16:21:19.4146678Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:19.4147550Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:19.4218290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:20.8075396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:20.8075953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:20.8294372Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:20.8295022Z warnings.warn(
2022-09-27T16:21:20.8295913Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:20.8296508Z warnings.warn(
2022-09-27T16:21:21.2572692Z ok (3.711s)
2022-09-27T16:21:21.2607272Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_False (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46217
2022-09-27T16:21:21.2613597Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46218
2022-09-27T16:21:22.9246961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:22.9247680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:22.9248845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:22.9249319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:22.9389689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:22.9390348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:22.9393389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:22.9393867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:23.1656518Z dist init r=1, world=2
2022-09-27T16:21:23.1659075Z dist init r=0, world=2
2022-09-27T16:21:23.1667305Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:23.1670246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:23.1671192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:23.1771751Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:24.5550588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:24.5551592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:24.5776529Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:24.5777337Z warnings.warn(
2022-09-27T16:21:24.5778463Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:24.5779207Z warnings.warn(
2022-09-27T16:21:25.0689046Z ok (3.811s)
2022-09-27T16:21:25.0724110Z test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_True (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46298
2022-09-27T16:21:25.0730440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46299
2022-09-27T16:21:26.6821557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:26.6822069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:26.6823759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:26.6824244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:26.7142814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:26.7143290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:26.7146035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:26.7146516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:26.9216845Z dist init r=1, world=2
2022-09-27T16:21:26.9227189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:26.9402686Z dist init r=0, world=2
2022-09-27T16:21:26.9414600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:26.9415492Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:26.9431085Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:28.3610511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:28.3611066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:28.3856363Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:28.3857029Z warnings.warn(
2022-09-27T16:21:28.3857946Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1170: UserWarning: Trying to ignore the top-level module passed into the FSDP constructor itself will result in all parameters being ignored and is not well-supported: Linear(in_features=4, out_features=4, bias=True)
2022-09-27T16:21:28.3858839Z warnings.warn(
2022-09-27T16:21:28.8808237Z ok (3.812s)
2022-09-27T16:21:28.8826699Z test_wrong_state_dict_config (__main__.TestFSDPStateDict) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46379
2022-09-27T16:21:28.8833519Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46380
2022-09-27T16:21:30.5343121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:30.5343620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:30.5344420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:30.5344918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:30.5472250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:30.5472694Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:30.5476232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:30.5476707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:30.7677409Z dist init r=0, world=2
2022-09-27T16:21:30.7688099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:21:30.7742171Z dist init r=1, world=2
2022-09-27T16:21:30.7753826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:21:30.7754795Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:30.7790821Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:21:32.1521252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:32.1521790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:32.1737550Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:32.1738340Z warnings.warn(
2022-09-27T16:21:32.1739749Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:21:32.1740519Z warnings.warn(
2022-09-27T16:21:32.5908168Z ok (3.710s)
2022-09-27T16:21:32.5908450Z 
2022-09-27T16:21:32.5908818Z ----------------------------------------------------------------------
2022-09-27T16:21:32.5909171Z Ran 70 tests in 276.150s
2022-09-27T16:21:32.5909332Z 
2022-09-27T16:21:32.5909425Z OK
2022-09-27T16:21:32.5909559Z 
2022-09-27T16:21:32.5909694Z Generating XML reports...
2022-09-27T16:21:32.6019553Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict/TEST-TestFSDPStateDict-20220927161656.xml
2022-09-27T16:21:32.9570663Z Running distributed/_shard/sharded_tensor/test_sharded_tensor ... [2022-09-27 16:21:32.956552]
2022-09-27T16:21:32.9571509Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:21:32.956630]
2022-09-27T16:21:34.9144393Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor
2022-09-27T16:21:34.9180075Z 
2022-09-27T16:21:34.9180336Z Running tests...
2022-09-27T16:21:34.9180771Z ----------------------------------------------------------------------
2022-09-27T16:21:36.4286068Z test_empty (__main__.TestCreateTensorFromParams) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:21:36.4454476Z ok (1.527s)
2022-09-27T16:21:36.4491251Z test_local_tensor (__main__.TestLocalTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46495
2022-09-27T16:21:36.4497622Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46496
2022-09-27T16:21:36.4504336Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46497
2022-09-27T16:21:36.4511088Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46498
2022-09-27T16:21:38.0935952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:38.0936610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:38.0937557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:38.0938030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:38.1008788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:38.1009239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:38.1011773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:38.1012266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:38.1294048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:38.1294497Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:38.1297133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:38.1297616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:38.1388947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:38.1389392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:38.1392653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:38.1393143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:38.3483150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
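The runner logs the exact argv it executes, which makes local reproduction straightforward. A sketch of replaying it, assuming a PyTorch source checkout and working from its test/ directory (the --import-* flags load the slow/disabled test lists behind the "loaded 45 slow tests" warnings):

    import subprocess

    # Same argv the runner logged above.
    subprocess.run(
        [
            "python", "-bb",
            "distributed/_shard/sharded_tensor/test_sharded_tensor.py",
            "-v", "--import-slow-tests", "--import-disabled-tests",
        ],
        check=True,
    )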
2022-09-27T16:21:38.3592112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:21:38.3800916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:38.3834091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:21:38.7569648Z skip: Need at least 4 CUDA devices (2.311s)
2022-09-27T16:21:38.7589479Z test_local_tensor_error (__main__.TestLocalTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46631
2022-09-27T16:21:38.7597179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46632
2022-09-27T16:21:38.7603940Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46633
2022-09-27T16:21:38.7610709Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46634
2022-09-27T16:21:40.3919273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:40.3920162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:40.3920755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:40.3921489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:40.3922073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:40.3922548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:40.3923138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:40.3923599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:40.4131468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:40.4132138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:40.4134598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:40.4135343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:40.4487394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:40.4488133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:40.4490537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:40.4491307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:40.6792766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:40.6796639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:21:40.6817585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:21:40.6906070Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:41.0666135Z skip: Need at least 4 CUDA devices (2.310s)
2022-09-27T16:21:41.0686340Z test_collect_local_shard (__main__.TestModuleHookApi) ...
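The "skip: Need at least 4 CUDA devices" results above are expected rather than failures: this runner exposes fewer than four GPUs, and the sharded-tensor suite gates each test on the visible device count. The suite uses its own internal decorators for this; an equivalent stdlib-only sketch of the same gating:

    import unittest

    import torch

    class Demo(unittest.TestCase):
        @unittest.skipUnless(
            torch.cuda.device_count() >= 4,
            "Need at least 4 CUDA devices",  # the message seen in the log
        )
        def test_needs_four_gpus(self) -> None:
            self.assertGreaterEqual(torch.cuda.device_count(), 4)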
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46767
2022-09-27T16:21:41.0692750Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46768
2022-09-27T16:21:41.0699548Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46769
2022-09-27T16:21:41.0706323Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46770
2022-09-27T16:21:42.7297387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:42.7297919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:42.7299630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:42.7300119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:42.7434278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:42.7434747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:42.7437459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:42.7438180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:42.7632147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:42.7632631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:42.7635566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:42.7636053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:42.7677201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:42.7677655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:42.7680752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:42.7681237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:42.9952127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:43.0079367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:21:43.0080574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:43.0117318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:21:43.3762224Z skip: Need at least 4 CUDA devices (2.309s)
2022-09-27T16:21:43.3785492Z test_reshard_output (__main__.TestModuleHookApi) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46903
2022-09-27T16:21:43.3791818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46904
2022-09-27T16:21:43.3798813Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46905
2022-09-27T16:21:43.3805498Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46906
2022-09-27T16:21:45.0078572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:45.0079074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:45.0079667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:45.0080148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:45.0418421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:45.0418927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:45.0422074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:45.0422557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:45.0423159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:45.0423986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:45.0426659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:45.0427134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:45.0833415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:21:45.0833882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:21:45.0835592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:21:45.0836066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:21:45.2737236Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:21:45.2959414Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:21:45.2963161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:21:45.3306792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:21:45.6859947Z skip: Need at least 4 CUDA devices (2.310s)
2022-09-27T16:21:45.6878197Z test_create_shard_with_no_placement (__main__.TestShardMetadata) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47039 2022-09-27T16:21:45.6884317Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47040 2022-09-27T16:21:45.6890676Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47041 2022-09-27T16:21:45.6897629Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47042 2022-09-27T16:21:47.3126856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:47.3127376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:47.3128467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:47.3128934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:47.3129515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:47.3129998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:47.3130563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:47.3131026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:47.3337305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:47.3337756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:47.3340308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:47.3340790Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:47.3428325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:47.3428785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:47.3431852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:47.3432335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:47.5653286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:21:47.5910119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:21:47.5911222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:21:47.5991444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:21:47.9954695Z skip: Need at least 4 CUDA devices (2.309s) 2022-09-27T16:21:47.9975580Z test_shard_metadata_init (__main__.TestShardMetadata) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47175 2022-09-27T16:21:47.9982169Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47176 2022-09-27T16:21:47.9988874Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47177 2022-09-27T16:21:47.9995867Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47178 2022-09-27T16:21:49.6419670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:49.6420450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:49.6421024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:49.6421470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:49.6422054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:49.6422526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:49.6423099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:49.6423561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:49.6557091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:49.6557539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:49.6558956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:49.6559423Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:49.6614876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:49.6615317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:49.6618287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:49.6618756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:49.9073125Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:21:49.9236361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:21:49.9246667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:21:49.9305897Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:21:50.3051952Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:21:50.3078759Z test_shard_parameter (__main__.TestShardParameter) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47311 2022-09-27T16:21:50.3085753Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47312 2022-09-27T16:21:50.3093205Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47313 2022-09-27T16:21:50.3100461Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47314 2022-09-27T16:21:51.9240515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:51.9241036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:51.9242167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:51.9242675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:51.9475853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:51.9476319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:51.9478983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:51.9479469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:51.9585774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:51.9586394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:51.9590105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:51.9590569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:51.9732720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:51.9733178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:51.9736503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:51.9736964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:52.1945873Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:21:52.1979114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:21:52.2086494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:21:52.2216291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:21:52.6156827Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:21:52.6186557Z test_shard_parameter_errors (__main__.TestShardParameter) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47447 2022-09-27T16:21:52.6193038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47448 2022-09-27T16:21:52.6201218Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47449 2022-09-27T16:21:52.6207953Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47450 2022-09-27T16:21:54.2397511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:54.2398057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:54.2398631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:54.2399074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:54.2399654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:54.2400121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:54.2400711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:54.2401159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:54.2428848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:54.2429313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:54.2432127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:54.2432832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:54.2748365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:54.2748829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:54.2751665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:54.2752123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:54.4948845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:21:54.5089781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:21:54.5099828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:21:54.5256079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:21:54.9264820Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:21:54.9287219Z test_shard_tensor (__main__.TestShardTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47583 2022-09-27T16:21:54.9293751Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47584 2022-09-27T16:21:54.9300406Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47585 2022-09-27T16:21:54.9307124Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47586 2022-09-27T16:21:56.5535186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:56.5536160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:56.5537332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:56.5538225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:56.5797863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:56.5798786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:56.5800710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:56.5801656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:56.6104261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:56.6105192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:56.6107294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:56.6108292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:56.6563558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:56.6564385Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:56.6565381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:56.6566214Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:56.8169416Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:21:56.8290215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:21:56.8517591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:21:56.9067233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:21:57.3364582Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:21:57.3389883Z test_shard_tensor_errors (__main__.TestShardTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47719 2022-09-27T16:21:57.3396157Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47720 2022-09-27T16:21:57.3402627Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47721 2022-09-27T16:21:57.3408834Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47722 2022-09-27T16:21:58.9645524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:58.9646522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:58.9647651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:58.9648941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:58.9877597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:58.9878456Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:58.9880594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:58.9881187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:59.0900701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:59.0901198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:59.0901765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:59.0902253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:59.1147780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:21:59.1148252Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:21:59.1152085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:21:59.1152573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:21:59.2353845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:21:59.2398547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:21:59.3391912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:21:59.3665657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:21:59.7464761Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:21:59.7486055Z test_cleanup (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47855 2022-09-27T16:21:59.7492893Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47856 2022-09-27T16:21:59.7500169Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47857 2022-09-27T16:21:59.7507152Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47858 2022-09-27T16:22:01.3732714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:01.3733680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:01.3734851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:01.3735787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:01.3805924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:01.3806821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:01.3809548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:01.3810514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:01.3855449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:01.3856346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:01.3858451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:01.3859361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:01.4132602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:01.4133567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:01.4135115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:01.4136073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:01.6333132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:01.6372529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:01.6490158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:01.6642088Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:02.0562727Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:22:02.0599686Z test_complete_world_size (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47991 2022-09-27T16:22:02.0606801Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47992 2022-09-27T16:22:02.0613603Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 47993 2022-09-27T16:22:02.0620662Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 47994 2022-09-27T16:22:03.7219574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:03.7220566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:03.7221729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:03.7222666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:03.7246947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:03.7247866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:03.7251589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:03.7252564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:03.7450228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:03.7451184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:03.7453043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:03.7454010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:03.7601903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:03.7603159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:03.7605363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:03.7606334Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:03.9796204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:03.9925264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:04.0010604Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:04.0075274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:04.3675712Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:22:04.3690585Z test_create_sharded_tensor_like (__main__.TestShardedTensorChunked) 2022-09-27T16:22:04.3705594Z Test tensor like methods, i.e. torch.zeros_like(...), torch.full_like, etc. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48127 2022-09-27T16:22:04.3712566Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48128 2022-09-27T16:22:04.3719606Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48129 2022-09-27T16:22:04.3726369Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48130 2022-09-27T16:22:06.0995327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:06.0995875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:06.0997448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:06.0997951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:06.0998522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:06.0998968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:06.1001603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:06.1002083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:06.1308159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:06.1308652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:06.1311035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:06.1312115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:06.1474629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:06.1475101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:06.1477522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:06.1478004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:06.3547451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:06.3765138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:06.3802600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:06.3875602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:06.7784760Z skip: Need at least 4 CUDA devices (2.411s) 2022-09-27T16:22:06.7795030Z test_create_sharded_tensor_with_full (__main__.TestShardedTensorChunked) 2022-09-27T16:22:06.7809170Z Test sharded_tensor.full(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48263 2022-09-27T16:22:06.7815340Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48264 2022-09-27T16:22:06.7821922Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48265 2022-09-27T16:22:06.7829205Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48266 2022-09-27T16:22:08.4181917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:08.4182898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:08.4184052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:08.4185275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:08.4361917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:08.4362401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:08.4365384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:08.4365856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:08.4436839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:08.4437725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:08.4440515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:08.4441387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:08.4649805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:08.4650776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:08.4652405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:08.4653335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:08.6989809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:08.7003215Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:08.7011212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:08.7059522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:09.0885270Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:22:09.0893054Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorChunked) 2022-09-27T16:22:09.0906056Z Test sharded_tensor.ones(...) ... 
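The TestShardedTensorChunked creation-op tests in this stretch (zeros_like and full above, ones, rand and zeros below) exercise factory functions from the private torch.distributed._shard.sharded_tensor module. A rough sketch of the kind of call they make follows; it assumes the private API as shipped in this 2022 build (ChunkShardingSpec placements plus sharded_tensor.ones, signatures may have changed since) and an already-initialized four-rank process group with one GPU per rank.

    # Illustrative use of the private sharded_tensor factory API;
    # not the test body itself, and subject to change across versions.
    from torch.distributed._shard import sharded_tensor
    from torch.distributed._shard.sharding_spec import ChunkShardingSpec

    def build_ones_sharded_tensor():
        # Shard dim 0 of a 10x10 tensor across four ranks, one GPU each.
        spec = ChunkShardingSpec(
            dim=0,
            placements=[
                "rank:0/cuda:0",
                "rank:1/cuda:1",
                "rank:2/cuda:2",
                "rank:3/cuda:3",
            ],
        )
        st = sharded_tensor.ones(spec, 10, 10)
        # Each rank holds only its local chunk; verify it is all ones.
        for shard in st.local_shards():
            assert bool(shard.tensor.eq(1).all())
        return st

This placement list is also why the guard asks for four devices: each shard is pinned to one of cuda:0 through cuda:3.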
2022-09-27T16:22:09.0893054Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorChunked)
2022-09-27T16:22:09.0906056Z Test sharded_tensor.ones(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48399
2022-09-27T16:22:09.0912557Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48400
2022-09-27T16:22:09.0919117Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48401
2022-09-27T16:22:09.0925300Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48402
2022-09-27T16:22:10.9760791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:11.0546374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:11.0722423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:11.0827107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:11.4985863Z skip: Need at least 4 CUDA devices (2.410s)
2022-09-27T16:22:11.5000685Z test_create_sharded_tensor_with_rand (__main__.TestShardedTensorChunked)
2022-09-27T16:22:11.5014875Z Test sharded_tensor.rand(...)/randn(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48535
2022-09-27T16:22:11.5021537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48536
2022-09-27T16:22:11.5028373Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48537
2022-09-27T16:22:11.5035706Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48538
2022-09-27T16:22:13.4229805Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:13.4359514Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:13.4743004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:13.4825706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:13.9099963Z skip: Need at least 4 CUDA devices (2.411s)
2022-09-27T16:22:13.9109183Z test_create_sharded_tensor_with_zeros (__main__.TestShardedTensorChunked)
2022-09-27T16:22:13.9124913Z Test sharded_tensor.zeros(...) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48671
2022-09-27T16:22:13.9132160Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48672
2022-09-27T16:22:13.9139578Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48673
2022-09-27T16:22:13.9147001Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48674
2022-09-27T16:22:15.7988588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:15.8089999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:15.8201455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:15.8807753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:16.2202548Z skip: Need at least 4 CUDA devices (2.310s)
2022-09-27T16:22:16.2211040Z test_gather_even (__main__.TestShardedTensorChunked)
2022-09-27T16:22:16.2237609Z Test _sharded_tensor.gather(...) with evenly distributed._shards ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48807
2022-09-27T16:22:16.2244041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48808
2022-09-27T16:22:16.2251063Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48809
2022-09-27T16:22:16.2257758Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48810
2022-09-27T16:22:18.1391832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:18.1468966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:18.1471945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:18.1485269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:18.5319660Z skip: Need at least 4 CUDA devices (2.311s)
2022-09-27T16:22:18.5336161Z test_gather_uneven (__main__.TestShardedTensorChunked)
2022-09-27T16:22:18.5356919Z Test _sharded_tensor.gather(...) with unevenly distributed._shards ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48943
2022-09-27T16:22:18.5365247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48944
2022-09-27T16:22:18.5374448Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 48945
2022-09-27T16:22:18.5383391Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 48946
2022-09-27T16:22:20.4393386Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:20.4450867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:20.4743117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:20.5233476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:20.9443490Z skip: Need at least 4 CUDA devices (2.412s)
2022-09-27T16:22:20.9469718Z test_insufficient_sharding_dims (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49079
2022-09-27T16:22:20.9477426Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49080
2022-09-27T16:22:20.9484595Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49081
2022-09-27T16:22:20.9491903Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49082
2022-09-27T16:22:22.8458634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:22.8560253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:22.8584057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:22.8646023Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:23.2547762Z skip: Need at least 4 CUDA devices (2.310s)
2022-09-27T16:22:23.2570127Z test_invalid_pg_rpc_ranks (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49215
2022-09-27T16:22:23.2577220Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49216
2022-09-27T16:22:23.2584833Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49217
2022-09-27T16:22:23.2592004Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49218
2022-09-27T16:22:25.1472881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:25.1626791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:25.1676223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:25.2026671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:25.5649209Z skip: Need at least 4 CUDA devices (2.310s)
2022-09-27T16:22:25.5682817Z test_invalid_sharding (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49351
2022-09-27T16:22:25.5688884Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49352
2022-09-27T16:22:25.5695573Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49353
2022-09-27T16:22:25.5702500Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49354
2022-09-27T16:22:27.5499662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:27.5583033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:27.5619723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:27.5753831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:27.9759510Z skip: Need at least 4 CUDA devices (2.411s)
2022-09-27T16:22:27.9785458Z test_load_state_dict_errors (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49487
2022-09-27T16:22:27.9791767Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49488
2022-09-27T16:22:27.9798756Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49489
2022-09-27T16:22:27.9805531Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49490
2022-09-27T16:22:29.9606625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:29.9634759Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:29.9710608Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:30.0007374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:30.3863702Z skip: Need at least 4 CUDA devices (2.410s)
2022-09-27T16:22:30.3892157Z test_multiple_local_shards (__main__.TestShardedTensorChunked) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49623
2022-09-27T16:22:30.3898512Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49624
2022-09-27T16:22:30.3905377Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49625
2022-09-27T16:22:30.3911975Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49626
2022-09-27T16:22:32.2661019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:22:32.2895567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3
2022-09-27T16:22:32.2896057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2
2022-09-27T16:22:32.3021488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:22:32.6967116Z skip: Need at least 4 CUDA devices (2.310s)
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49759 2022-09-27T16:22:32.7004252Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49760 2022-09-27T16:22:32.7010860Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49761 2022-09-27T16:22:32.7017528Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49762 2022-09-27T16:22:34.3923819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:34.3924333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:34.3925383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:34.3925901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:34.4289070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:34.4289558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:34.4292362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:34.4292842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:34.4636411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:34.4636884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:34.4637961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:34.4638430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:34.5544251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:34.5544730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:34.5545725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:34.5546205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:34.6502220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:34.6674201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:34.7010647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:34.8057253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:35.2077661Z skip: Need at least 4 CUDA devices (2.511s) 2022-09-27T16:22:35.2108715Z test_partial_world_size (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49895 2022-09-27T16:22:35.2116108Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49896 2022-09-27T16:22:35.2123833Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 49897 2022-09-27T16:22:35.2131196Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 49898 2022-09-27T16:22:36.8000564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:36.8001085Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:36.8001676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:36.8002142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:36.8475542Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:36.8476018Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:36.8478120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:36.8478600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:36.8592237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:36.8592700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:36.8595488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:36.8595965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:36.8848968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:36.8849615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:36.8850836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:36.8851295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:37.0554021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:37.0903445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:37.1017165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:37.1234557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:37.5194500Z skip: Need at least 4 CUDA devices (2.312s) 2022-09-27T16:22:37.5222569Z test_sharded_tensor_metadata (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50031 2022-09-27T16:22:37.5229055Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50032 2022-09-27T16:22:37.5236126Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50033 2022-09-27T16:22:37.5242713Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50034 2022-09-27T16:22:39.2040744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:39.2041261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:39.2042234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:39.2042690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:39.2062287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:39.2062746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:39.2066196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:39.2066657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:39.2352378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:39.2353066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:39.2356120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:39.2356588Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:39.2450151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:39.2450612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:39.2453689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:39.2454296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:39.4605908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:39.4719832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:39.4833487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:39.5019654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:39.9300607Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:22:39.9332848Z test_sharded_tensor_sizes (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50167 2022-09-27T16:22:39.9339725Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50168 2022-09-27T16:22:39.9346771Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50169 2022-09-27T16:22:39.9354122Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50170 2022-09-27T16:22:41.5747008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:41.5747525Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:41.5748716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:41.5749181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:41.5958808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:41.5959273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:41.5960391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:41.5960833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:41.5961718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:41.5962184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:41.5964488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:41.5964942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:41.6194622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:41.6195074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:41.6197965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:41.6198432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:41.8499309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:41.8572724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:41.8578921Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:41.8612624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:42.2409039Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:22:42.2433783Z test_sharding_columns (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50303 2022-09-27T16:22:42.2440689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50304 2022-09-27T16:22:42.2447409Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50305 2022-09-27T16:22:42.2454393Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50306 2022-09-27T16:22:43.8698365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:43.8699347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:43.8700500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:43.8701399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:43.8740297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:43.8741194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:43.8744141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:43.8745143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:43.9196050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:43.9196910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:43.9197940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:43.9198748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:43.9303571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:43.9304431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:43.9305775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:43.9306610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:44.1185636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:44.1285075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:44.1680198Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:44.1728050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:44.5511073Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:22:44.5535574Z test_state_dict (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50439 2022-09-27T16:22:44.5542073Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50440 2022-09-27T16:22:44.5548907Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50441 2022-09-27T16:22:44.5556202Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50442 2022-09-27T16:22:46.1833635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:46.1834384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:46.1835512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:46.1835987Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:46.1885192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:46.1885645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:46.1888598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:46.1889076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:46.2389125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:46.2389785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:46.2391120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:46.2391621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:46.2616377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:46.2616992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:46.2617850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:46.2618325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:46.4328481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:46.4429961Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:46.4817917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:46.5095361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:46.8611079Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:22:46.8634700Z test_state_dict_new_group (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50575 2022-09-27T16:22:46.8640833Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50576 2022-09-27T16:22:46.8647271Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50577 2022-09-27T16:22:46.8654026Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50578 2022-09-27T16:22:48.4952727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:48.4954121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:48.4954720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:48.4955177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:48.5005972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:48.5006429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:48.5009237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:48.5009697Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:48.5307327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:48.5307780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:48.5310860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:48.5311614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:48.5946481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:48.5946958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:48.5947817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:48.5948260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:48.7502678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:48.7543629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:48.7712129Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:48.8323149Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:49.1711823Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:22:49.1734035Z test_state_dict_no_sharded_tensors (__main__.TestShardedTensorChunked) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50711 2022-09-27T16:22:49.1749454Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50712 2022-09-27T16:22:49.1750003Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50713 2022-09-27T16:22:49.1755345Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50714 2022-09-27T16:22:50.8040598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:50.8041426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:50.8042049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:50.8042771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:50.8478760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:50.8479407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:50.8481346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:50.8481827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:50.8536262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:50.8536741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:50.8539767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:50.8540247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:50.8873002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:50.8873799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:50.8875768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:50.8876245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:51.0584183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:51.0904207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:51.0961132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:51.1340890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:51.5812791Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:22:51.5834844Z test_custom_op (__main__.TestShardedTensorCustomOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50847 2022-09-27T16:22:51.5842052Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50848 2022-09-27T16:22:51.5849117Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50849 2022-09-27T16:22:51.5856358Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50850 2022-09-27T16:22:53.2208419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:53.2209177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:53.2210085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:53.2210562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:53.2623982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:53.2624456Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:53.2627437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:53.2627912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:53.2826525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:53.2827150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:53.2829680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:53.2830162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:53.3060897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:53.3061412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:53.3062563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:53.3063038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:53.4851828Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:53.5055961Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:53.5394373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:53.5618899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:53.9913443Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:22:53.9933976Z test_custom_op_errors (__main__.TestShardedTensorCustomOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50983 2022-09-27T16:22:53.9941109Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50984 2022-09-27T16:22:53.9948079Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50985 2022-09-27T16:22:53.9956391Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50986 2022-09-27T16:22:55.6345404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:55.6346363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:55.6347525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:55.6348725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:55.6678963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:55.6679898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:55.6681856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:55.6682828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:55.6836491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:55.6837412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:55.6839803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:55.6840788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:55.7155432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:55.7156250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:55.7157633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:55.7158438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:55.9035197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:55.9109922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:55.9236714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:55.9621354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:56.4011406Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:22:56.4033228Z test_custom_op_override (__main__.TestShardedTensorCustomOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51119 2022-09-27T16:22:56.4039998Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51120 2022-09-27T16:22:56.4047127Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51121 2022-09-27T16:22:56.4054558Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51122 2022-09-27T16:22:58.0365426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:58.0365960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:58.0367117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:58.0367635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:58.0406964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:58.0407444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:58.0410253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:58.0410736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:58.0706342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:58.0706803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:58.0709560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:58.0710038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:58.1204037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:22:58.1204545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:22:58.1205349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:22:58.1205819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:22:58.2912678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:22:58.3034511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:22:58.3110467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:22:58.3683901Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:22:58.7108123Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:22:58.7120060Z test_create_sharded_tensor_with_ones (__main__.TestShardedTensorEnumerable) 2022-09-27T16:22:58.7134510Z Test sharded_tensor.ones(...) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51255 2022-09-27T16:22:58.7140905Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51256 2022-09-27T16:22:58.7147220Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51257 2022-09-27T16:22:58.7154792Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51258 2022-09-27T16:23:00.3384258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:00.3385253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:00.3386464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:00.3387392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:00.3515458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:00.3516353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:00.3519324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:00.3520273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:00.4112047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:00.4112943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:00.4113960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:00.4114747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:00.4233813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:00.4234805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:00.4236118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:00.4236645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:00.5929881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:00.5994455Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:00.6553020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:00.6733652Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:01.0211986Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:01.0222707Z test_gather_even (__main__.TestShardedTensorEnumerable) 2022-09-27T16:23:01.0237144Z Test _sharded_tensor.gather(...) with evenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51391 2022-09-27T16:23:01.0243286Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51392 2022-09-27T16:23:01.0249620Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51393 2022-09-27T16:23:01.0256242Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51394 2022-09-27T16:23:02.6555172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:02.6556399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:02.6557772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:02.6558278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:02.6649127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:02.6649595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:02.6652816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:02.6653281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:02.6732820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:02.6733281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:02.6736735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:02.6737208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:02.7029650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:02.7030118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:02.7033254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:02.7033718Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:02.9156745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:02.9299876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:02.9310428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:02.9539466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:03.3312887Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:03.3324273Z test_gather_uneven (__main__.TestShardedTensorEnumerable) 2022-09-27T16:23:03.3337676Z Test _sharded_tensor.gather(...) with unevenly distributed._shards ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51527 2022-09-27T16:23:03.3343997Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51528 2022-09-27T16:23:03.3350421Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51529 2022-09-27T16:23:03.3356913Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51530 2022-09-27T16:23:04.9592594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:04.9593530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:04.9594701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:04.9595212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:04.9753693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:04.9754157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:04.9758175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:04.9758668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:04.9957763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:04.9958226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:04.9961274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:04.9961736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:05.0287198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:05.0288089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:05.0289142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:05.0289594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:05.2317246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:05.2317787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:05.2511644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:05.3140994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:05.6412095Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:05.6445506Z test_grid_sharding (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51663 2022-09-27T16:23:05.6451571Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51664 2022-09-27T16:23:05.6457829Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51665 2022-09-27T16:23:05.6464375Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51666 2022-09-27T16:23:07.3072979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:07.3073470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:07.3074566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:07.3075057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:07.3166015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:07.3166457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:07.3169029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:07.3169508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:07.3420831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:07.3421268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:07.3424455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:07.3425152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:07.3689164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:07.3689643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:07.3691021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:07.3691484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:07.5681222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:07.5775068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:07.5909865Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:07.6213217Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:08.0596892Z skip: Need at least 4 CUDA devices (2.418s) 2022-09-27T16:23:08.0632969Z test_multiple_local_shards (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51799 2022-09-27T16:23:08.0639669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51800 2022-09-27T16:23:08.0646250Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51801 2022-09-27T16:23:08.0653115Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51802 2022-09-27T16:23:09.6930666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:09.6931641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:09.6932866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:09.6933784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:09.7385159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:09.7385634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:09.7388071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:09.7388556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:09.7569518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:09.7570520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:09.7572413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:09.7573434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:09.7796381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:09.7796869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:09.7799001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:09.7799481Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:09.9763048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:09.9822406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:10.0019353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:10.0275766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:10.4711612Z skip: Need at least 4 CUDA devices (2.411s) 2022-09-27T16:23:10.4746917Z test_new_group (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51935 2022-09-27T16:23:10.4753749Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51936 2022-09-27T16:23:10.4761520Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51937 2022-09-27T16:23:10.4768928Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51938 2022-09-27T16:23:12.1678788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:12.1679298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:12.1680168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:12.1680903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:12.2538073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:12.2538555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:12.2539991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:12.2540701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:12.2658698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:12.2659143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:12.2662099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:12.2662577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:12.2772911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:12.2773356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:12.2775952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:12.2776438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:12.4211711Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:12.5035444Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:12.5199489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:12.5256941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:12.8824473Z skip: Need at least 4 CUDA devices (2.411s) 2022-09-27T16:23:12.8859030Z test_partial_world_size (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52071 2022-09-27T16:23:12.8865513Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52072 2022-09-27T16:23:12.8872653Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52073 2022-09-27T16:23:12.8879726Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52074 2022-09-27T16:23:14.4975725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:14.4976261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:14.4976842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:14.4977328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:14.5330258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:14.5330773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:14.5333101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:14.5333582Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:14.5344580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:14.5345051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:14.5348230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:14.5731740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:14.5733056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:14.5733547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:14.5734429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:14.5734917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:14.7648864Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:14.7817552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:14.7849228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:14.8212045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:15.1936170Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:23:15.1957061Z test_sharded_tensor_device (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52207 2022-09-27T16:23:15.1964154Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52208 2022-09-27T16:23:15.1970696Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52209 2022-09-27T16:23:15.1977710Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52210 2022-09-27T16:23:16.8360871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:16.8361388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:16.8362747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:16.8363229Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:16.8387717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:16.8388306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:16.8391561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:16.8392049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:16.8399569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:16.8400025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:16.8403380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:16.8403865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:16.8515093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:16.8515773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:16.8518525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:16.8519009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:17.1010021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:17.1010537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:17.1040980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:17.1087387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:17.5034375Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:17.5068041Z test_sharded_tensor_metadata (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52343 2022-09-27T16:23:17.5074894Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52344 2022-09-27T16:23:17.5081918Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52345 2022-09-27T16:23:17.5088764Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52346 2022-09-27T16:23:19.1487114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:19.1488114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:19.1489283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:19.1490207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:19.1761959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:19.1762857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:19.1763945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:19.1764805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:19.2054751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:19.2055716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:19.2057699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:19.2058666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:19.2098412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:19.2099425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:19.2101093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:19.2102103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:19.4162690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:19.4352229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:19.4487706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:19.4520511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:19.8143657Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:23:19.8180786Z test_sharded_tensor_to_cpu (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52479 2022-09-27T16:23:19.8187620Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52480 2022-09-27T16:23:19.8194719Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52481 2022-09-27T16:23:19.8201400Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52482 2022-09-27T16:23:21.4446575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:21.4447111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:21.4448329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:21.4448803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:21.4469791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:21.4470486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:21.4473805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:21.4474287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:21.4495070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:21.4495519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:21.4498652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:21.4499125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:21.4719198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:21.4719685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:21.4723440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:21.4723918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:21.7057161Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:21.7057729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:21.7112826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:21.7216991Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:22.1257166Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:23:22.1287026Z test_sharded_tensor_to_cuda (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52615 2022-09-27T16:23:22.1293461Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52616 2022-09-27T16:23:22.1300695Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52617 2022-09-27T16:23:22.1307114Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52618 2022-09-27T16:23:23.7519549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:23.7520038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:23.7521066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:23.7521546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:23.7575202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:23.7575661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:23.7578530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:23.7579039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:23.8159148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:23.8159634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:23.8162113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:23.8162594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:23.8225100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:23.8225831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:23.8228052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:23.8228539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:24.0047080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:24.0093460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:24.0606367Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:24.0727782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:24.4363336Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:24.4402463Z test_sharded_tensor_to_test (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52751 2022-09-27T16:23:24.4408306Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52752 2022-09-27T16:23:24.4414982Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52753 2022-09-27T16:23:24.4421649Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52754 2022-09-27T16:23:26.1032852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:26.1033871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:26.1035034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:26.1035900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:26.1037076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:26.1038038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:26.1039223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:26.1040165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:26.1044584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:26.1045502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:26.1047459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:26.1048410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:26.1213205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:26.1214141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:26.1215728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:26.1216928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:26.3639522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:26.3711954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:26.3721338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:26.3734316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:26.7481725Z skip: Need at least 4 CUDA devices (2.312s) 2022-09-27T16:23:26.7516740Z test_uneven_shards (__main__.TestShardedTensorEnumerable) ... 
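The "Started process N with pid ..." and "Starting event listener thread for rank N" entries trace the multiprocess harness itself: the parent forks one worker per rank, and each worker runs a listener thread for control and error messages before executing the test body. A simplified sketch of the spawn side using torch.multiprocessing (illustrative only; the real MultiProcessTestCase adds pipes, error propagation, and timeouts):

    import threading
    import torch.multiprocessing as mp

    WORLD_SIZE = 4  # matches the four ranks spawned per test in this log

    def worker(rank):
        # Each rank starts an event listener thread, then runs the test body.
        listener = threading.Thread(target=lambda: None, name=f"listener-{rank}", daemon=True)
        listener.start()
        print(f"rank {rank}: running test body")

    if __name__ == "__main__":
        mp.spawn(worker, nprocs=WORLD_SIZE, join=True)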
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52887 2022-09-27T16:23:26.7523133Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52888 2022-09-27T16:23:26.7529797Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52889 2022-09-27T16:23:26.7536865Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52890 2022-09-27T16:23:28.3847415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:28.3848399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:28.3849615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:28.3850520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:28.4154484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:28.4155372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:28.4159182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:28.4160149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:28.4746040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:28.4746895Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:28.4747920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:28.4748726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:28.4967802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:28.4968771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:28.4969984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:28.4970963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:28.6529020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:28.6584395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:28.7195667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:28.7441552Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:29.1593976Z skip: Need at least 4 CUDA devices (2.411s) 2022-09-27T16:23:29.1627300Z test_with_rpc_names (__main__.TestShardedTensorEnumerable) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53023 2022-09-27T16:23:29.1633702Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53024 2022-09-27T16:23:29.1640263Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53025 2022-09-27T16:23:29.1647124Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53026 2022-09-27T16:23:30.7986398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:30.7986905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:30.7987941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:30.7988427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:30.8054494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:30.8054956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:30.8058599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:30.8059336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:30.8110703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:30.8111376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:30.8115074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:30.8115549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:30.8132890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:30.8133347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:30.8136398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:30.8136882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:31.0622056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:31.0677040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:31.0709333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:31.0796887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:31.4708763Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:23:31.4732117Z test_init_from_local_shards (__main__.TestShardedTensorFromLocalShards) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78068 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-09-27T16:23:31.4769772Z test_init_from_local_shards_and_global_metadata (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53159 2022-09-27T16:23:31.4776509Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53160 2022-09-27T16:23:31.4784034Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53161 2022-09-27T16:23:31.4791693Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53162 2022-09-27T16:23:33.1319397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:33.1319900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:33.1323118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:33.1323618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:33.1548804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:33.1549640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:33.1552200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:33.1552667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:33.1642535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:33.1643003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:33.1646139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:33.1646599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:33.1714714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:33.1715180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:33.1718180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:33.1718658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:33.4236202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:33.4249985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:33.4287745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:33.4291099Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:33.7846783Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:23:33.7888160Z test_init_from_local_shards_and_global_metadata_invalid_shards (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53295 2022-09-27T16:23:33.7894130Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53296 2022-09-27T16:23:33.7900659Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53297 2022-09-27T16:23:33.7907248Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53298 2022-09-27T16:23:35.4101058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:35.4101592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:35.4103364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:35.4103872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:35.4276861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:35.4277340Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:35.4280435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:35.4280916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:35.4477670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:35.4478144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:35.4480991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:35.4481475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:35.4526983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:35.4527682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:35.4530426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:35.4530909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:35.6879334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:35.6906064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:35.6937096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:35.7059112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:36.0966632Z skip: Need at least 4 CUDA devices (2.312s) 2022-09-27T16:23:36.0994485Z test_init_from_local_shards_invalid_local_shards (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53431 2022-09-27T16:23:36.1001136Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53432 2022-09-27T16:23:36.1007797Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53433 2022-09-27T16:23:36.1014860Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53434 2022-09-27T16:23:37.7215517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:37.7216514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:37.7217710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:37.7218616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:37.7371904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:37.7372808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:37.7376066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:37.7377023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:37.7596971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:37.7597928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:37.7599853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:37.7600789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:37.8128928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:37.8129954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:37.8131181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:37.8132104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:37.9914470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:37.9918854Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:38.0014736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:38.0596781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:38.4069523Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:38.4093246Z test_init_from_local_shards_invalid_pin_memory (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53567 2022-09-27T16:23:38.4099444Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53568 2022-09-27T16:23:38.4105732Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53569 2022-09-27T16:23:38.4112918Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53570 2022-09-27T16:23:40.0410436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:40.0410962Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:40.0412035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:40.0412563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:40.0423142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:40.0423618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:40.0426415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:40.0426897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:40.0561041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:40.0561502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:40.0564279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:40.0564761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:40.0691895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:40.0692351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:40.0695452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:40.0695934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:40.2920089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:40.3085319Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:40.3111098Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:40.3199544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:40.3416231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:23:40.3517303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-09-27T16:23:40.3517826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:23:40.3518294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-09-27T16:23:40.3519065Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-09-27T16:23:40.3519762Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-09-27T16:23:40.3520446Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-09-27T16:23:40.3521109Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-09-27T16:23:40.7174411Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:40.7203426Z test_init_from_local_shards_invalid_property_cross_ranks (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53715 2022-09-27T16:23:40.7209138Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53716 2022-09-27T16:23:40.7215468Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53717 2022-09-27T16:23:40.7221840Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53718 2022-09-27T16:23:42.3806833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:42.3807334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:42.3808503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:42.3809225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:42.3817985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:42.3818455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:42.3821858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:42.3822327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:42.4187279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:42.4187745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:42.4190400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:42.4191042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:42.4487198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:42.4487700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:42.4489413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:42.4489909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:42.6189299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:42.6476681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:42.6570801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:42.6914368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:43.1289480Z skip: Need at least 4 CUDA devices (2.411s) 
2022-09-27T16:23:43.1310974Z test_init_from_local_shards_invalid_shards_gaps (__main__.TestShardedTensorFromLocalShards) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53851 2022-09-27T16:23:43.1318441Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53852 2022-09-27T16:23:43.1325821Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53853 2022-09-27T16:23:43.1333254Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53854 2022-09-27T16:23:44.7723840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:44.7724655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:44.7725264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:44.7725749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:44.7939211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:44.7939866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:44.7943064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:44.7943546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:44.7949325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:44.7949767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:44.7952828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:44.7953307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:44.7972495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:44.7972941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:44.7976216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:44.7976691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:45.0445282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:45.0543087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:45.0591193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:45.0619282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:45.4389770Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:45.4410574Z test_init_from_local_shards_invalid_shards_overlap (__main__.TestShardedTensorFromLocalShards) ... 
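Unlike its neighbors, test_init_from_local_shards_invalid_pin_memory above got far enough to initialize a process group, which is what produced the store_based_barrier entries: every rank increments a shared counter in the rendezvous store ("Added key: ... to store for rank: N"), then polls until all four ranks have checked in ("Completed store-based barrier ... with 4 nodes"). A sketch of that rendezvous (illustrative; the real helper lives in torch.distributed.distributed_c10d, and store would be e.g. a torch.distributed.TCPStore):

    import datetime

    def store_based_barrier(rank, store, world_size,
                            timeout=datetime.timedelta(minutes=5)):
        key = "store_based_barrier_key:1"
        store.add(key, 1)  # logs "Added key: ... to store for rank: N"
        start = datetime.datetime.now()
        # add(key, 0) returns the current count without changing it
        while store.add(key, 0) < world_size:
            if datetime.datetime.now() - start > timeout:
                raise RuntimeError(f"Timed out waiting on {key}")
        # logs "Rank N: Completed store-based barrier for key:... with 4 nodes."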
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53987 2022-09-27T16:23:45.4416774Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53988 2022-09-27T16:23:45.4423425Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53989 2022-09-27T16:23:45.4429832Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53990 2022-09-27T16:23:47.1285917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:47.1286444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:47.1287599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:47.1288115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:47.1718582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:47.1719078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:47.1721812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:47.1722346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:47.2016469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:47.2016967Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:47.2018023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:47.2018514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:47.2243735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:47.2244516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:47.2245569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:47.2246079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:47.3810873Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:47.4104664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:47.4538399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:47.4627486Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:47.8488639Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:23:47.8515679Z test_init_from_local_shards_new_group (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54123 2022-09-27T16:23:47.8522159Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54124 2022-09-27T16:23:47.8528698Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54125 2022-09-27T16:23:47.8535739Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54126 2022-09-27T16:23:49.4847416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:49.4848088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:49.4849283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:49.4849770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:49.4960753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:49.4961245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:49.4964362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:49.4964837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:49.5629617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:49.5630418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:49.5631475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:49.5631944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:49.5651698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:49.5652324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:49.5655486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:49.5655959Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:49.7421456Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:49.7463834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:49.8142243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:49.8195602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:50.2592940Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:23:50.2614935Z test_local_shards (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54259 2022-09-27T16:23:50.2621405Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54260 2022-09-27T16:23:50.2627403Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54261 2022-09-27T16:23:50.2634171Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54262 2022-09-27T16:23:51.8931522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:51.8932550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:51.8933762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:51.8934704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:51.9231913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:51.9233196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:51.9234905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:51.9235877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:51.9739131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:51.9740115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:51.9741305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:51.9742188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:51.9779177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:51.9780097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:51.9782349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:51.9783257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:52.1595779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:52.1633974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:52.2220146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:52.2221123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:52.5689852Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:23:52.5721787Z test_st_base_init_from_local_shards_and_global_metadata (__main__.TestShardedTensorFromLocalShards) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54395 2022-09-27T16:23:52.5727997Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54396 2022-09-27T16:23:52.5734856Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54397 2022-09-27T16:23:52.5742702Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54398 2022-09-27T16:23:54.1929091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:54.1929607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:54.1930179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:54.1930646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:54.1999616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:54.2000094Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:54.2003409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:54.2003915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:54.2427448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:54.2427927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:54.2430073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:54.2430551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:54.2859900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:54.2860687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:54.2861765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:54.2862237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:54.4428817Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:54.4537511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:54.4817654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:54.5324616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:54.8797432Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:23:54.8817744Z test_init_from_local_tensor (__main__.TestShardedTensorFromLocalTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54531 2022-09-27T16:23:54.8823942Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54532 2022-09-27T16:23:54.8830567Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54533 2022-09-27T16:23:54.8838060Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54534 2022-09-27T16:23:56.5074596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:56.5075351Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:56.5076254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:56.5076737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:56.5123087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:56.5123560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:56.5127166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:56.5127646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:56.5243567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:56.5244028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:56.5246999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:56.5247458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:56.5429376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:56.5429841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:56.5433674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:56.5434366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:56.7634845Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:56.7794888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:56.7828834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:56.7962091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:57.1894708Z skip: Need at least 4 CUDA devices (2.309s) 2022-09-27T16:23:57.1918686Z test_init_from_local_tensor_errors (__main__.TestShardedTensorFromLocalTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54667 2022-09-27T16:23:57.1924728Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54668 2022-09-27T16:23:57.1931350Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54669 2022-09-27T16:23:57.1937598Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54670 2022-09-27T16:23:58.8256294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:58.8256810Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:58.8258100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:58.8258655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:58.8479677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:58.8480169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:58.8483226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:58.8483704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:58.8940134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:58.8940634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:58.8941844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:58.8942301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:58.9260885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:23:58.9261393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:23:58.9262021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:23:58.9262484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:23:59.1000672Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:23:59.1015694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:23:59.1401678Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:23:59.1718403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:23:59.5993278Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:23:59.6523354Z test_serialize_and_deserialize (__main__.TestShardedTensorMetadata) ... ok (0.053s) 2022-09-27T16:23:59.6523668Z 2022-09-27T16:23:59.6524056Z ---------------------------------------------------------------------- 2022-09-27T16:23:59.6524397Z Ran 64 tests in 144.734s 2022-09-27T16:23:59.6526285Z 2022-09-27T16:23:59.6526800Z OK (skipped=62) 2022-09-27T16:23:59.6527144Z 2022-09-27T16:23:59.6527813Z Generating XML reports... 
2022-09-27T16:23:59.6569925Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestCreateTensorFromParams-20220927162134.xml 2022-09-27T16:23:59.6572959Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorMetadata-20220927162134.xml 2022-09-27T16:23:59.6577655Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestLocalTensor-20220927162134.xml 2022-09-27T16:23:59.6582330Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestModuleHookApi-20220927162134.xml 2022-09-27T16:23:59.6587068Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardMetadata-20220927162134.xml 2022-09-27T16:23:59.6592344Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardParameter-20220927162134.xml 2022-09-27T16:23:59.6598095Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardTensor-20220927162134.xml 2022-09-27T16:23:59.6628604Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorChunked-20220927162134.xml 2022-09-27T16:23:59.6635042Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorCustomOps-20220927162134.xml 2022-09-27T16:23:59.6654636Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorEnumerable-20220927162134.xml 2022-09-27T16:23:59.6670654Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalShards-20220927162134.xml 2022-09-27T16:23:59.6675742Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalTensor-20220927162134.xml 2022-09-27T16:24:00.0271464Z Running distributed/test_c10d_pypg ... [2022-09-27 16:24:00.026514] 2022-09-27T16:24:00.0272326Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:24:00.026592] 2022-09-27T16:24:01.9141732Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_pypg 2022-09-27T16:24:01.9162839Z 2022-09-27T16:24:01.9163199Z Running tests... 2022-09-27T16:24:01.9163630Z ---------------------------------------------------------------------- 2022-09-27T16:24:01.9171024Z test_ddp_checkpointing_dynamic_module (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:03.4422616Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:24:03.4621463Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54838 2022-09-27T16:24:05.0546863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:05.0547367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:05.0549450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:05.0549934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:05.2976631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:06.5345414Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr1bjmj70 2022-09-27T16:24:06.5346295Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr1bjmj70/_remote_module_non_scriptable.py 2022-09-27T16:24:07.3697226Z ok (5.453s) 2022-09-27T16:24:07.3703116Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:07.3733904Z Dynamic module can be checkpointed multiple times with weight sharing ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54875 2022-09-27T16:24:08.9941405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:08.9941927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:08.9943307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:08.9943772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:09.2412429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:10.4949287Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7s3y31ez 2022-09-27T16:24:10.4949923Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7s3y31ez/_remote_module_non_scriptable.py 2022-09-27T16:24:11.3821893Z ok (4.012s) 2022-09-27T16:24:11.3828402Z test_ddp_checkpointing_once_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:11.3855351Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54912 2022-09-27T16:24:13.0048193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:13.0048733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:13.0051080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:13.0051593Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:13.2583375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:14.5330371Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk3p4aw6y 2022-09-27T16:24:14.5331232Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk3p4aw6y/_remote_module_non_scriptable.py 2022-09-27T16:24:14.9803421Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:24:15.0104446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:15.0253499Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:24:15.0254265Z warnings.warn( 2022-09-27T16:24:15.0361318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:15.0566478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:15.0855420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:15.1103920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:15.4931895Z ok (4.111s) 2022-09-27T16:24:15.4938503Z test_ddp_checkpointing_once_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:15.4965738Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54949 2022-09-27T16:24:17.1115255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:17.1115767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:17.1118399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:17.1118890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:17.3559807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:18.5996763Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxs6ou5ms 2022-09-27T16:24:18.5997598Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxs6ou5ms/_remote_module_non_scriptable.py 2022-09-27T16:24:19.0377911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:19.0670212Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:19.0815549Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:24:19.0816386Z warnings.warn( 2022-09-27T16:24:19.0918844Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:19.1117017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:19.1397793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:19.1638110Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:24:19.6041671Z ok (4.111s) 2022-09-27T16:24:19.6046436Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:19.6071957Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54986 2022-09-27T16:24:21.1909444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:21.1909959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:21.1912846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:21.1913335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:21.4446627Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:22.7268572Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp4zlsy9j 2022-09-27T16:24:22.7269789Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp4zlsy9j/_remote_module_non_scriptable.py 2022-09-27T16:24:23.1669485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:23.1973565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:23.6163791Z ok (4.012s) 2022-09-27T16:24:23.6168678Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:23.6195661Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55023 2022-09-27T16:24:25.2382694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:25.2383199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:25.2385645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:25.2386151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:25.4918400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:26.7592249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpswl7tlwm 2022-09-27T16:24:26.7592863Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpswl7tlwm/_remote_module_non_scriptable.py 2022-09-27T16:24:27.1963502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:27.2272525Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:27.6270558Z ok (4.011s) 2022-09-27T16:24:27.6279401Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:27.6304733Z Checkpointing twice fails for non-static graph with reentrant checkpoint ... 
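The use_reentrant_True/False pairs in this file parametrize torch.utils.checkpoint.checkpoint over its use_reentrant flag; per the docstring above, checkpointing the same non-static graph twice is only expected to work with the non-reentrant implementation. A standalone sketch of the flag itself (toy module, no DDP, illustrative only):

    import torch
    from torch.utils.checkpoint import checkpoint

    layer = torch.nn.Linear(8, 8)
    x = torch.randn(2, 8, requires_grad=True)

    # The non-reentrant implementation replays the forward through saved-tensor
    # hooks, so the same layer can be checkpointed twice in one graph.
    h = checkpoint(layer, x, use_reentrant=False)
    y = checkpoint(layer, h, use_reentrant=False)
    y.sum().backward()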
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55060 2022-09-27T16:24:29.1990151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:29.1990676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:29.1992111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:29.1992586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:29.4403588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:30.6819262Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzk0tndz0 2022-09-27T16:24:30.6820116Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzk0tndz0/_remote_module_non_scriptable.py 2022-09-27T16:24:31.1141395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:31.1380896Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-09-27T16:24:31.1731177Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:31.6380777Z ok (4.011s) 2022-09-27T16:24:31.6389040Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:31.6415982Z Checkpointing twice fails for non-static graph with reentrant checkpoint ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55097 2022-09-27T16:24:33.2523637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:33.2524166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:33.2526550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:33.2527017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:33.5005270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:34.7732077Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyhh2f0c5 2022-09-27T16:24:34.7733030Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyhh2f0c5/_remote_module_non_scriptable.py 2022-09-27T16:24:35.6488679Z ok (4.011s) 2022-09-27T16:24:35.6493210Z test_ddp_checkpointing_twice_weight_sharing (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:35.6520512Z Checkpointing should work with static graph in the case of checkpointing ... 
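The UserWarning repeated throughout this run (distributed.py:1772, "You passed find_unused_parameters=true ...") is emitted when find_unused_parameters=True is combined with a static graph, where DDP detects unused parameters on its own. A minimal sketch of the combination that triggers it, assuming a single-process gloo group; note that _set_static_graph is a private API, and static_graph=True in the DDP constructor is the public spelling:

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    ddp = DDP(torch.nn.Linear(4, 4), find_unused_parameters=True)
    ddp._set_static_graph()  # emits the UserWarning quoted in the log
    dist.destroy_process_group()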
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55134 2022-09-27T16:24:37.2538598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:37.2539137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:37.2540323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:37.2540799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:37.5051852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:38.7840694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpelyp5z7s 2022-09-27T16:24:38.7841527Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpelyp5z7s/_remote_module_non_scriptable.py 2022-09-27T16:24:39.2365734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:39.2653109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:39.6597655Z ok (4.011s) 2022-09-27T16:24:39.6605053Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:39.6629446Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55171 2022-09-27T16:24:41.2845828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:41.2846326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:41.2848350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:41.2848834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:41.5391820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:42.8093094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpczy190ym 2022-09-27T16:24:42.8094007Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpczy190ym/_remote_module_non_scriptable.py 2022-09-27T16:24:43.2337354Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-09-27T16:24:43.2603186Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:24:43.2603935Z warnings.warn( 2022-09-27T16:24:43.2710679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:24:43.3214425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:43.7706345Z ok (4.111s) 2022-09-27T16:24:43.7713462Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:43.7739036Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55208 2022-09-27T16:24:45.3881724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:45.3882231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:45.3884831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:45.3885337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:45.6372833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:46.9008955Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpig9r39z7 2022-09-27T16:24:46.9010229Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpig9r39z7/_remote_module_non_scriptable.py 2022-09-27T16:24:47.3410856Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:24:47.3411943Z warnings.warn( 2022-09-27T16:24:47.3537023Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:47.3931498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:47.7819962Z ok (4.011s) 2022-09-27T16:24:47.7828527Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:47.7851599Z Test that checkpointing with weight sharing works. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55245 2022-09-27T16:24:49.4134994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:49.4135534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:49.4137498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:49.4137980Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:49.6623140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:50.9223486Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjgksq__2 2022-09-27T16:24:50.9224576Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjgksq__2/_remote_module_non_scriptable.py 2022-09-27T16:24:51.3432240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:51.3770366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:24:51.7924998Z ok (4.010s) 2022-09-27T16:24:51.7933379Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.TestDDPWithWorkSubclass) 2022-09-27T16:24:51.7956612Z Test that checkpointing with weight sharing works. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55282 2022-09-27T16:24:53.3726703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:53.3727213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:53.3728012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:53.3728496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:53.6152853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:54.8633855Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplgyuc_nv 2022-09-27T16:24:54.8634861Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplgyuc_nv/_remote_module_non_scriptable.py 2022-09-27T16:24:55.2981347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:55.3274313Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:55.3472020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:55.3754624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:24:55.8046554Z ok (4.012s) 2022-09-27T16:24:55.8080856Z test_ddp_invoke_work_object (__main__.TestDDPWithWorkSubclass) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55319 2022-09-27T16:24:57.4277515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:57.4278031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:57.4280314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:57.4281128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:57.6775237Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:57.6891480Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu1b6kl0l 2022-09-27T16:24:57.6894583Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu1b6kl0l/_remote_module_non_scriptable.py 2022-09-27T16:24:58.0122983Z ok (2.208s) 2022-09-27T16:24:58.0279690Z test_ddp_with_pypg (__main__.TestDDPWithWorkSubclass) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55355 2022-09-27T16:24:59.6292428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:24:59.6292947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:24:59.6295309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:24:59.6295804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:24:59.8818821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:24:59.8937841Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu2jv243h 2022-09-27T16:24:59.8940906Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu2jv243h/_remote_module_non_scriptable.py 2022-09-27T16:24:59.9148282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:00.2323773Z ok (2.220s) 2022-09-27T16:25:00.2352382Z test_ddp_with_pypg_with_grad_views (__main__.TestDDPWithWorkSubclass) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55391 2022-09-27T16:25:01.8295965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:01.8296475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:01.8298582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:01.8299065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:02.0776825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:02.0889818Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_9130um 2022-09-27T16:25:02.0892257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_9130um/_remote_module_non_scriptable.py 2022-09-27T16:25:02.1088191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:02.4397173Z ok (2.207s) 2022-09-27T16:25:02.4429399Z test_invalid_powerSGD_state (__main__.TestDDPWithWorkSubclass) ... 
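Each "PowerSGD config: ..." INFO line that follows is printed when a PowerSGDState is constructed; test_invalid_powerSGD_state builds the six combinations shown below (start_powerSGD_iter of 0 or 1 with error feedback and/or warm start enabled) and expects each to be rejected. A sketch against the real PowerSGDState constructor; the ValueError condition reflects the library's documented constraint that start_powerSGD_iter must exceed 1 when either feature is on:

    from torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook import PowerSGDState

    # Valid: start_powerSGD_iter > 1, so error feedback and warm start are allowed.
    ok_state = PowerSGDState(
        process_group=None,          # None selects the default process group
        matrix_approximation_rank=1,
        start_powerSGD_iter=2,
        min_compression_rate=2,
        use_error_feedback=True,
        warm_start=True,
    )

    # One of the invalid combinations exercised by the test: the config is logged
    # first (hence the INFO lines below), then construction raises ValueError.
    try:
        PowerSGDState(process_group=None, start_powerSGD_iter=0, use_error_feedback=True)
    except ValueError as err:
        print("rejected as expected:", err)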
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55427 2022-09-27T16:25:04.0435620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:04.0436175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:04.0438304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:04.0438783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:04.2855311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:04.2859994Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-09-27T16:25:04.2861474Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-09-27T16:25:04.2862544Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-09-27T16:25:04.2863609Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-09-27T16:25:04.2864674Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-09-27T16:25:04.2865720Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-09-27T16:25:04.5473542Z ok (2.107s) 2022-09-27T16:25:04.5508081Z test_sync_batch_norm_empty_input (__main__.TestDDPWithWorkSubclass) ... 
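test_sync_batch_norm_empty_input covers SyncBatchNorm when a rank receives a zero-size batch. The conversion call below is the real API; actually running the converted model across ranks needs an initialized process group, so this sketch stops at building the empty input:

    import torch

    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.BatchNorm2d(8))
    # Replace every BatchNorm layer with its cross-process SyncBatchNorm equivalent.
    sync_model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

    empty_batch = torch.empty(0, 3, 16, 16)  # batch dimension of zero, as in the test name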
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55461 2022-09-27T16:25:06.1382289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:06.1382795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:06.1385256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:06.1385751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:06.3876229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:07.6672779Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr8asj25t 2022-09-27T16:25:07.6673808Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr8asj25t/_remote_module_non_scriptable.py 2022-09-27T16:25:07.6870652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:08.0574050Z ok (3.510s) 2022-09-27T16:25:08.0609626Z test_sync_batch_norm_only_empty_input (__main__.TestDDPWithWorkSubclass) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55498 2022-09-27T16:25:09.6723710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:09.6724213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:09.6725678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:09.6726221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:09.9205519Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:11.1751192Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpulzs9k6k 2022-09-27T16:25:11.1752826Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpulzs9k6k/_remote_module_non_scriptable.py 2022-09-27T16:25:11.1945001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:11.5676480Z ok (3.510s) 2022-09-27T16:25:11.5682176Z test_ddp_checkpointing_dynamic_module (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:11.5709952Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55535 2022-09-27T16:25:13.1361066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:13.1362047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:13.1363364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:13.1363884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:13.3875862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:14.6646139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpibw7qw0g 2022-09-27T16:25:14.6647257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpibw7qw0g/_remote_module_non_scriptable.py 2022-09-27T16:25:15.5785958Z ok (4.011s) 2022-09-27T16:25:15.5790377Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:15.5819093Z Dynamic module can be checkpointed multiple times with weight sharing ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55572 2022-09-27T16:25:17.1785024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:17.1785524Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:17.1787895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:17.1788407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:17.4280230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:18.6923047Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkut7mqss 2022-09-27T16:25:18.6924077Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkut7mqss/_remote_module_non_scriptable.py 2022-09-27T16:25:19.4893907Z ok (3.911s) 2022-09-27T16:25:19.4900086Z test_ddp_checkpointing_once_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:19.4927724Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55609 2022-09-27T16:25:21.1013501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:21.1013984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:21.1015512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:21.1016002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:21.3540876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:22.6152338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9g9kwihk 2022-09-27T16:25:22.6153418Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9g9kwihk/_remote_module_non_scriptable.py 2022-09-27T16:25:23.0578638Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:23.0877531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:25:23.1025572Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:25:23.1026653Z warnings.warn( 2022-09-27T16:25:23.1133036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:23.1339339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:23.1627282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:23.1875050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:23.6005905Z ok (4.111s) 2022-09-27T16:25:23.6012132Z test_ddp_checkpointing_once_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:23.6043518Z DDP works as expected when layer is checkpointed only once. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55646 2022-09-27T16:25:25.1742877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:25.1743384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:25.1744422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:25.1744880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:25.4179808Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:26.6533430Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgrnhd3c7 2022-09-27T16:25:26.6534396Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgrnhd3c7/_remote_module_non_scriptable.py 2022-09-27T16:25:27.0881785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:27.1172835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:27.1316878Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:25:27.1317627Z warnings.warn( 2022-09-27T16:25:27.1418799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:27.1615425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:27.1896648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:27.2136918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:27.6121597Z ok (4.011s) 2022-09-27T16:25:27.6127011Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:27.6155439Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55683 2022-09-27T16:25:29.2229253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:29.2229732Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:29.2231690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:29.2232173Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:29.4655927Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:30.7041139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwnn914oj 2022-09-27T16:25:30.7041761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwnn914oj/_remote_module_non_scriptable.py 2022-09-27T16:25:31.1346409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:31.1717253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:31.5225925Z ok (3.910s) 2022-09-27T16:25:31.5230788Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:31.5256588Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55720 2022-09-27T16:25:33.1140574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:33.1141099Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:33.1142860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:33.1143325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:33.3626145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:34.6072891Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdml_h0_9 2022-09-27T16:25:34.6074805Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdml_h0_9/_remote_module_non_scriptable.py 2022-09-27T16:25:35.0331288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:35.0632764Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:35.4541191Z ok (3.931s) 2022-09-27T16:25:35.4548287Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:35.4573932Z Checkpointing twice fails for non-static graph with reentrant checkpoint ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55757 2022-09-27T16:25:37.0705936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:37.0706429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:37.0708750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:37.0709230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:37.3228257Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:38.5842890Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcmkd8djq 2022-09-27T16:25:38.5843757Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcmkd8djq/_remote_module_non_scriptable.py 2022-09-27T16:25:39.0276954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:39.0526129Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-09-27T16:25:39.0889736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:39.5648882Z ok (4.111s) 2022-09-27T16:25:39.5656268Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:39.5685650Z Checkpointing twice fails for non-static graph with reentrant checkpoint ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55794 2022-09-27T16:25:41.1853732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:41.1854255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:41.1855664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:41.1856149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:41.4406697Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:42.7031979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu_03s18z 2022-09-27T16:25:42.7032968Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu_03s18z/_remote_module_non_scriptable.py 2022-09-27T16:25:43.5760863Z ok (4.011s) 2022-09-27T16:25:43.5765295Z test_ddp_checkpointing_twice_weight_sharing (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:43.5790196Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55831 2022-09-27T16:25:45.1887623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:45.1888136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:45.1889369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:45.1889844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:45.4381969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:46.7145092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp877qwf1q 2022-09-27T16:25:46.7145710Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp877qwf1q/_remote_module_non_scriptable.py 2022-09-27T16:25:47.1667993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:47.1958191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:47.5866866Z ok (4.010s) 2022-09-27T16:25:47.5874402Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:47.5906807Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55868 2022-09-27T16:25:49.1858824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:49.1859839Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:49.1861031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:49.1862350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:49.4334159Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:50.6895582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpae_qwa98 2022-09-27T16:25:50.6896729Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpae_qwa98/_remote_module_non_scriptable.py 2022-09-27T16:25:51.1070475Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-09-27T16:25:51.1327936Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:25:51.1329380Z warnings.warn( 2022-09-27T16:25:51.1431583Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:25:51.1921891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:51.5981346Z ok (4.011s) 2022-09-27T16:25:51.5988802Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:51.6015521Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55905 2022-09-27T16:25:53.2125400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:53.2125887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:53.2126973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:53.2127441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:53.4627998Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:54.7362928Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg4hty8hw 2022-09-27T16:25:54.7363523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg4hty8hw/_remote_module_non_scriptable.py 2022-09-27T16:25:55.1756023Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1772: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-09-27T16:25:55.1756816Z warnings.warn( 2022-09-27T16:25:55.1881815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:55.2278302Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:55.6090741Z ok (4.011s) 2022-09-27T16:25:55.6099307Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:55.6128700Z Test that checkpointing with weight sharing works. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55942 2022-09-27T16:25:57.1865451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:25:57.1866220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:25:57.1867127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:25:57.1878841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:25:57.4425000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:25:58.7096677Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp99ka2ic_ 2022-09-27T16:25:58.7097289Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp99ka2ic_/_remote_module_non_scriptable.py 2022-09-27T16:25:59.1530305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:25:59.1878698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-09-27T16:25:59.6218605Z ok (4.013s) 2022-09-27T16:25:59.6227555Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.TestDDPWithWorkWrapper) 2022-09-27T16:25:59.6295521Z Test that checkpointing with weight sharing works. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55979 2022-09-27T16:26:01.2802060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:26:01.2802548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:26:01.2803382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:26:01.2803861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:26:01.5345347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:26:02.8147495Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1q5q6dyi 2022-09-27T16:26:02.8148358Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1q5q6dyi/_remote_module_non_scriptable.py 2022-09-27T16:26:03.2579479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:26:03.2870318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:26:03.3066178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:26:03.3347220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:26:03.7377935Z ok (4.116s) 2022-09-27T16:26:03.7412323Z test_ddp_invoke_work_object (__main__.TestDDPWithWorkWrapper) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56016 2022-09-27T16:26:05.3106019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:26:05.3106551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:26:05.3107665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:26:05.3108143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:26:05.5523274Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:26:05.5633798Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpatu26uzj 2022-09-27T16:26:05.5636209Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpatu26uzj/_remote_module_non_scriptable.py 2022-09-27T16:26:05.8453140Z ok (2.107s) 2022-09-27T16:26:05.8481529Z test_ddp_with_pypg (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56052 2022-09-27T16:26:07.4310612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:26:07.4311711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:26:07.4312353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:26:07.4312837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:26:07.6736522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:26:07.6849466Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6rh1mdge 2022-09-27T16:26:07.6851907Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6rh1mdge/_remote_module_non_scriptable.py 2022-09-27T16:26:07.7051468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:26:07.9521463Z ok (2.107s) 2022-09-27T16:26:07.9548033Z test_ddp_with_pypg_with_grad_views (__main__.TestDDPWithWorkWrapper) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56088 2022-09-27T16:26:09.5603770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:26:09.5604251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:26:09.5605460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:26:09.5605942Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:26:09.8021184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:26:09.8134227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpde19svix 2022-09-27T16:26:09.8137044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpde19svix/_remote_module_non_scriptable.py 2022-09-27T16:26:09.8332702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:26:10.1592895Z ok (2.207s) 2022-09-27T16:26:10.1625224Z test_invalid_powerSGD_state (__main__.TestDDPWithWorkWrapper) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56124
2022-09-27T16:26:11.7072347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:11.7072864Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:11.7075264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:11.7075766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:11.9537822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:11.9542946Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False
2022-09-27T16:26:11.9544409Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False
2022-09-27T16:26:11.9545652Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False
2022-09-27T16:26:11.9547202Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False
2022-09-27T16:26:11.9548286Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False
2022-09-27T16:26:11.9549331Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False
2022-09-27T16:26:12.2667755Z ok (2.107s)
2022-09-27T16:26:12.2704587Z test_sync_batch_norm_empty_input (__main__.TestDDPWithWorkWrapper) ...
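The PowerSGD config dumps a few lines up come from test_invalid_powerSGD_state: every dumped state has start_powerSGD_iter of 0 or 1 combined with error feedback or warm start enabled, which PowerSGDState rejects because vanilla allreduce must run for the first iterations. A sketch against the public hook API:

    import torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook as powerSGD

    # Valid: begin compressing at iteration 2.
    state = powerSGD.PowerSGDState(
        process_group=None,           # use the default process group
        matrix_approximation_rank=1,  # rank-1 low-rank approximation
        start_powerSGD_iter=2,
    )
    # ddp_model.register_comm_hook(state, powerSGD.powerSGD_hook)

    # Invalid (what the test asserts): start_powerSGD_iter <= 1 with
    # use_error_feedback or warm_start enabled raises ValueError.
    try:
        powerSGD.PowerSGDState(process_group=None, start_powerSGD_iter=0,
                               use_error_feedback=True)
    except ValueError as e:
        print(e)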
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56158
2022-09-27T16:26:13.8335704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:13.8336216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:13.8337081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:13.8337565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:14.0781749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:15.3339692Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu4yt340o
2022-09-27T16:26:15.3340323Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu4yt340o/_remote_module_non_scriptable.py
2022-09-27T16:26:15.3527224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:26:15.6770435Z ok (3.410s)
2022-09-27T16:26:15.6808826Z test_sync_batch_norm_only_empty_input (__main__.TestDDPWithWorkWrapper) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56195
2022-09-27T16:26:17.2649840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:17.2650319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:17.2652249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:17.2652717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:17.5074134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:18.7377747Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkrnc7xd1
2022-09-27T16:26:18.7378347Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkrnc7xd1/_remote_module_non_scriptable.py
2022-09-27T16:26:18.7552682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
2022-09-27T16:26:19.0873442Z ok (3.410s)
2022-09-27T16:26:19.0873651Z 
2022-09-27T16:26:19.0874026Z ----------------------------------------------------------------------
2022-09-27T16:26:19.0875882Z Ran 38 tests in 137.171s
2022-09-27T16:26:19.0876216Z 
2022-09-27T16:26:19.0876390Z OK
2022-09-27T16:26:19.0876545Z 
2022-09-27T16:26:19.0876692Z Generating XML reports...
2022-09-27T16:26:19.0932812Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkSubclass-20220927162401.xml
2022-09-27T16:26:19.0956315Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkWrapper-20220927162401.xml
2022-09-27T16:26:19.4660066Z Running distributed/fsdp/test_wrap ... [2022-09-27 16:26:19.465487]
2022-09-27T16:26:19.4661040Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_wrap.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:26:19.465568]
2022-09-27T16:26:21.3307111Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_wrap
2022-09-27T16:26:21.3328787Z 
2022-09-27T16:26:21.3329075Z Running tests...
2022-09-27T16:26:21.3329512Z ----------------------------------------------------------------------
2022-09-27T16:26:21.3334673Z test_always_wrap (__main__.TestAutoWrap)
2022-09-27T16:26:22.8071658Z Test to ensure that if `always_wrap_policy` is ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:26:22.8288413Z ok (1.496s)
2022-09-27T16:26:22.8321696Z test_always_wrap_with_ignored_modules_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:22.8323006Z warnings.warn(
2022-09-27T16:26:22.8342249Z ok (0.005s)
2022-09-27T16:26:22.8387538Z test_always_wrap_with_ignored_modules_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.004s)
2022-09-27T16:26:22.8394113Z test_auto_wrap_api (__main__.TestAutoWrap)
2022-09-27T16:26:22.8417002Z Test to ensure with auto wrap, we wrap child modules correctly based on the min_num_params. ... ok (0.003s)
2022-09-27T16:26:22.8424706Z test_auto_wrap_preset_exclude_wrap (__main__.TestAutoWrap)
2022-09-27T16:26:22.8439004Z Test to ensure excluded modules are not wrapped, regardless if the total param size is greater than the ... ok (0.002s)
2022-09-27T16:26:22.8445444Z test_auto_wrap_preset_exclude_wrap_include_children (__main__.TestAutoWrap)
2022-09-27T16:26:22.8458590Z Test to ensure excluded modules are not wrapped, but children are if param size is greater than ... [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:22.8459948Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:22.8461177Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:22.8462616Z ok (0.002s)
2022-09-27T16:26:22.8471102Z test_auto_wrap_preset_force_leaf (__main__.TestAutoWrap)
2022-09-27T16:26:22.8496758Z Test to ensure force-leaf modules are not wrapped, and children are not wrapped. The ... ok (0.003s)
2022-09-27T16:26:22.8505319Z test_auto_wrap_preset_force_leaf_custom (__main__.TestAutoWrap)
2022-09-27T16:26:22.8519889Z Test to ensure force-leaf modules are not wrapped. ... ok (0.002s)
2022-09-27T16:26:22.8560119Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=False)_use_device_id_False (__main__.TestAutoWrap) ...
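The TestAutoWrap cases above drive FSDP's auto-wrap policies: always_wrap_policy wraps every submodule, while the size-based policy behind min_num_params wraps only submodules above a parameter-count threshold (transformer_auto_wrap_policy, tested further below, works the same way with a set of layer classes instead). A minimal sketch, assuming a single-process gloo group as in the earlier snippets:

    import functools
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy

    model = torch.nn.Sequential(
        torch.nn.Linear(100, 100),  # 10,100 params -> wrapped as its own unit
        torch.nn.Linear(100, 1),    # 101 params    -> stays in the root unit
    )
    wrapped = FSDP(
        model,
        auto_wrap_policy=functools.partial(size_based_auto_wrap_policy,
                                           min_num_params=1_000),
    )
    print(wrapped)  # the large Linear appears as a nested FSDP instance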
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:22.8560988Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-09-27T16:26:22.8584122Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:22.8585381Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:22.8586597Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.3182877Z ok (0.466s)
2022-09-27T16:26:23.3221667Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=False)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:23.3222837Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-09-27T16:26:23.3620317Z ok (0.044s)
2022-09-27T16:26:23.3641067Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=True)_use_device_id_False (__main__.TestAutoWrap) ... ok (0.002s)
2022-09-27T16:26:23.3661236Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_AFTER_cpu_offload_CPUOffload(offload_params=True)_use_device_id_True (__main__.TestAutoWrap) ... ok (0.002s)
2022-09-27T16:26:23.3698965Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=False)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:23.3699916Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-09-27T16:26:23.3787774Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.3789018Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.4124724Z ok (0.046s)
2022-09-27T16:26:23.4160317Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=False)_use_device_id_True (__main__.TestAutoWrap) ...
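The store_based_barrier_key INFO lines above are init_process_group's store-based barrier: every rank bumps a shared counter in the rendezvous store, then waits until it equals world_size. A sketch of that primitive made explicit with a TCPStore (the key name just mirrors the log; the real counter lives under a process-group prefix):

    import datetime
    import torch.distributed as dist

    # Rank 0 hosts the store; all ranks connect to it for rendezvous.
    store = dist.TCPStore("127.0.0.1", 29502, 1, True,
                          timeout=datetime.timedelta(seconds=30))
    dist.init_process_group("gloo", store=store, rank=0, world_size=1)

    # The barrier primitive: an atomic add, then poll until world_size.
    arrived = store.add("store_based_barrier_key:1", 1)
    print(f"{arrived} of 1 ranks arrived")
    dist.destroy_process_group()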
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:23.4161237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-09-27T16:26:23.4548868Z ok (0.042s)
2022-09-27T16:26:23.4588707Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=True)_use_device_id_False (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:23.4589610Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-09-27T16:26:23.5005916Z ok (0.046s)
2022-09-27T16:26:23.5043664Z test_auto_wrap_smoke_test_cuda_init_mode_CUDAInitMode_CUDA_BEFORE_cpu_offload_CPUOffload(offload_params=True)_use_device_id_True (__main__.TestAutoWrap) ... INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:23.5044898Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
2022-09-27T16:26:23.5114019Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5115269Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5228935Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5230165Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5601651Z ok (0.059s)
2022-09-27T16:26:23.5631961Z test_auto_wrap_with_ignored_modules_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... ok (0.003s)
2022-09-27T16:26:23.5661315Z test_auto_wrap_with_ignored_modules_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.003s)
2022-09-27T16:26:23.5669906Z test_transformer_auto_wrap_policy (__main__.TestAutoWrap)
2022-09-27T16:26:23.5686475Z Tests the ``transformer_auto_wrap_policy``. ... [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5687785Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5853188Z ok (0.019s)
2022-09-27T16:26:23.5872203Z test_wrap_disabled_outside_context (__main__.TestAutoWrap) ... [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:23.5874346Z ok (0.002s)
2022-09-27T16:26:23.5896546Z test_wrap_override_defaults (__main__.TestAutoWrap) ... ok (0.002s)
2022-09-27T16:26:23.5916831Z test_wrap_wrap_method_WrapMethod_FSDP_CTOR (__main__.TestAutoWrap) ... ok (0.002s)
2022-09-27T16:26:23.5936509Z test_wrap_wrap_method_WrapMethod_WRAP_API (__main__.TestAutoWrap) ... ok (0.002s)
2022-09-27T16:26:23.5948377Z test_bn_always_wrapped_individually (__main__.TestFSDPWrap)
2022-09-27T16:26:23.5991693Z Ensures that by using _or_policy with _wrap_batchnorm_individually, even ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56305
2022-09-27T16:26:23.5998803Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56306
2022-09-27T16:26:25.2475224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:25.2475766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:25.2476582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:25.2477032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:25.2477598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:25.2478066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:25.2480504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:25.2481164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:25.5124534Z dist init r=1, world=2
2022-09-27T16:26:25.5128335Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:25.5132928Z dist init r=0, world=2
2022-09-27T16:26:25.5137902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:25.5139033Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:25.5231291Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:26.8880950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:26.8881479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:26.9105338Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:26.9106128Z warnings.warn(
2022-09-27T16:26:26.9107226Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:26.9107974Z warnings.warn(
2022-09-27T16:26:27.4077632Z ok (3.814s)
2022-09-27T16:26:27.4084734Z test_error_already_wrapped_nested_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap)
2022-09-27T16:26:27.4113641Z Test that an error is raised if we attempt to wrap when submodules are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56386
2022-09-27T16:26:27.4121050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56387
2022-09-27T16:26:29.0137214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:29.0137721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:29.0138321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:29.0138782Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:29.0482726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:29.0483191Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:29.0486986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:29.0487478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:29.2687441Z dist init r=1, world=2
2022-09-27T16:26:29.2690792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:29.2924777Z dist init r=0, world=2
2022-09-27T16:26:29.2930147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:29.2930933Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:29.2997560Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:30.6784593Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:30.6785132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:30.7016411Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:30.7017203Z warnings.warn(
2022-09-27T16:26:30.7018301Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:30.7019053Z warnings.warn(
2022-09-27T16:26:31.1198903Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:31.1201277Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:31.1203915Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref)
2022-09-27T16:26:31.1205163Z ok (3.712s)
2022-09-27T16:26:31.1211493Z test_error_already_wrapped_nested_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap)
2022-09-27T16:26:31.1239694Z Test that an error is raised if we attempt to wrap when submodules are ...
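The UserWarning from fully_sharded_data_parallel.py:1414 that keeps repeating above is FSDP asking for its device_id argument. A sketch of the recommended form (hypothetical module; assumes a CUDA device and an initialized group):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Passing device_id lets FSDP move the CPU-constructed module to the GPU
    # itself, so flattening/sharding run on-device and sync_module_states
    # can use GPU collectives, exactly as the warning recommends.
    model = FSDP(
        torch.nn.Linear(8, 8),                 # built on CPU, as in the warning
        device_id=torch.cuda.current_device(),
        sync_module_states=True,
    )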
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56467
2022-09-27T16:26:31.1246612Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56468
2022-09-27T16:26:32.7577750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:32.7578718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:32.7579940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:32.7581240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:32.8346913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:32.8347910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:32.8349123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:32.8350001Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:33.0113433Z dist init r=1, world=2
2022-09-27T16:26:33.0117478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:33.0731686Z dist init r=0, world=2
2022-09-27T16:26:33.0737677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:33.0739075Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:33.0828496Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:34.4388812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:34.4389802Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:34.9325729Z ok (3.812s)
2022-09-27T16:26:34.9331381Z test_error_already_wrapped_nested_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap)
2022-09-27T16:26:34.9359224Z Test that an error is raised if we attempt to wrap when submodules are ...
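The WrapMethod_WRAP_API variants earlier in this suite (and test_wrap_disabled_outside_context) exercise FSDP's explicit wrap()/enable_wrap() API rather than constructor-driven auto-wrap. A sketch, again assuming an initialized group:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp.wrap import enable_wrap, wrap

    class Net(torch.nn.Module):
        def __init__(self):
            super().__init__()
            # Inside enable_wrap, wrap() applies wrapper_cls to its argument.
            with enable_wrap(wrapper_cls=FSDP):
                self.inner = wrap(torch.nn.Linear(8, 8))
            # Outside the context, wrap() is a passthrough no-op, which is
            # what test_wrap_disabled_outside_context appears to check.
            self.outer = wrap(torch.nn.Linear(8, 8))

    net = Net()
    print(type(net.inner).__name__)  # FullyShardedDataParallel
    print(type(net.outer).__name__)  # Linear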
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56548
2022-09-27T16:26:34.9366582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56549
2022-09-27T16:26:36.5822060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:36.5822795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:36.5824052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:36.5824545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:36.6181024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:36.6181485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:36.6184641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:36.6185104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:36.8365217Z dist init r=1, world=2
2022-09-27T16:26:36.8368989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:36.8618205Z dist init r=0, world=2
2022-09-27T16:26:36.8623699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:36.8624805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:36.8674855Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:38.2491290Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:38.2491827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:38.2694989Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:38.2695806Z warnings.warn(
2022-09-27T16:26:38.2729499Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:38.2730245Z warnings.warn(
2022-09-27T16:26:38.7445341Z ok (3.812s)
2022-09-27T16:26:38.7450736Z test_error_already_wrapped_nested_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap)
2022-09-27T16:26:38.7477072Z Test that an error is raised if we attempt to wrap when submodules are ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56629
2022-09-27T16:26:38.7484321Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56630
2022-09-27T16:26:40.4032096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:40.4032591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:40.4034151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:40.4034642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:40.4059269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:40.4059729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:40.4062846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:40.4063324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:40.6420021Z dist init r=1, world=2
2022-09-27T16:26:40.6424153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:40.6584171Z dist init r=0, world=2
2022-09-27T16:26:40.6589082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:40.6589983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:40.6628053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:42.0104925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:42.0105511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:42.4561727Z ok (3.712s)
2022-09-27T16:26:42.4610138Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ...
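The long parametrized sweep that follows crosses CPUOffload × BackwardPrefetch × forward_prefetch × CUDAInitMode. The first three are public FullyShardedDataParallel constructor options; CUDAInitMode is a harness-internal enum controlling when the module moves to GPU. A sketch of one point in the sweep:

    import torch
    from torch.distributed.fsdp import (
        BackwardPrefetch,
        CPUOffload,
        FullyShardedDataParallel as FSDP,
    )

    # offload_params=True parks flat parameters (and their grads) on CPU
    # between uses; BACKWARD_PRE prefetches the next all-gather before the
    # current gradient computation; forward_prefetch issues the next forward
    # all-gather early. Assumes an initialized process group.
    model = FSDP(
        torch.nn.Linear(8, 8),
        cpu_offload=CPUOffload(offload_params=True),
        backward_prefetch=BackwardPrefetch.BACKWARD_PRE,
        forward_prefetch=True,
    )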
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56710
2022-09-27T16:26:42.4616452Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56711
2022-09-27T16:26:44.1263901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:44.1264396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:44.1265772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:44.1266270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:44.1269498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:44.1269966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:44.1273912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:44.1274565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:44.3612958Z dist init r=0, world=2
2022-09-27T16:26:44.3617221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:44.3887616Z dist init r=1, world=2
2022-09-27T16:26:44.3892608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:44.3893694Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:44.3922090Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:45.7757647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:45.7758179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:45.7978539Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:45.7979350Z warnings.warn(
2022-09-27T16:26:45.8014985Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:45.8015769Z warnings.warn(
2022-09-27T16:26:46.7707418Z ok (4.314s)
2022-09-27T16:26:46.7757402Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56795
2022-09-27T16:26:46.7765421Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56796
2022-09-27T16:26:48.4427783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:48.4428285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:48.4429413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:48.4429865Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:48.4709151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:48.4709603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:48.4713059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:48.4713520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:48.7054117Z dist init r=0, world=2
2022-09-27T16:26:48.7058494Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:48.7173295Z dist init r=1, world=2
2022-09-27T16:26:48.7178724Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:48.7179557Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:48.7262537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:50.1163361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:50.1163878Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:51.0857542Z ok (4.315s)
2022-09-27T16:26:51.0906041Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56880
2022-09-27T16:26:51.0912903Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56881
2022-09-27T16:26:52.7445319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:52.7445830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:52.7447133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:52.7447597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:52.7545404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:52.7545866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:52.7548638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:52.7549126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:52.9992834Z dist init r=0, world=2
2022-09-27T16:26:52.9997374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:53.0012708Z dist init r=1, world=2
2022-09-27T16:26:53.0017776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:53.0018993Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:53.0099978Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:54.3800961Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:54.3801504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:54.4018569Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:54.4019446Z warnings.warn(
2022-09-27T16:26:54.4022309Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:26:54.4023126Z warnings.warn(
2022-09-27T16:26:55.4004131Z ok (4.315s)
2022-09-27T16:26:55.4073663Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56965
2022-09-27T16:26:55.4080293Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56966
2022-09-27T16:26:57.0704799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:57.0705278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:57.0707317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:57.0707801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:57.1151737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:26:57.1152267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:26:57.1155307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:26:57.1155787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:26:57.3217783Z dist init r=1, world=2
2022-09-27T16:26:57.3222031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:26:57.3557648Z dist init r=0, world=2
2022-09-27T16:26:57.3562219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:26:57.3563227Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:57.3629183Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:26:58.7392848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:26:58.7393377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:26:59.7165565Z ok (4.316s)
2022-09-27T16:26:59.7216375Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57050
2022-09-27T16:26:59.7222559Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57051
2022-09-27T16:27:01.4098985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:01.4099504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:01.4100746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:01.4101250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:01.4213323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:01.4214131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:01.4216780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:01.4217581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:01.6717458Z dist init r=0, world=2
2022-09-27T16:27:01.6721423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:01.7145368Z dist init r=1, world=2
2022-09-27T16:27:01.7149511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:01.7151261Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:01.7230437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:03.1067106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:03.1067946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:03.1338825Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:27:03.1340004Z warnings.warn(
2022-09-27T16:27:03.1341156Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:27:03.1341897Z warnings.warn(
2022-09-27T16:27:04.1311034Z ok (4.414s)
2022-09-27T16:27:04.1363432Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57135
2022-09-27T16:27:04.1369269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57136
2022-09-27T16:27:05.8389013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:05.8389515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:05.8390325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:05.8390988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:05.8539290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:05.8539725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:05.8542895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:05.8543394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:06.0991640Z dist init r=0, world=2
2022-09-27T16:27:06.0995825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:06.1047948Z dist init r=1, world=2
2022-09-27T16:27:06.1053188Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:06.1055044Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:06.1098845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:07.4798990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:07.4799953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:08.4456716Z ok (4.314s)
2022-09-27T16:27:08.4507188Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57220
2022-09-27T16:27:08.4513817Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57221
2022-09-27T16:27:10.0617779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:10.0618291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:10.0619375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:10.0619860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:10.1066687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:10.1067378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:10.1070149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:10.1070638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:10.3140518Z dist init r=1, world=2
2022-09-27T16:27:10.3144445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:10.3479384Z dist init r=0, world=2
2022-09-27T16:27:10.3484176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:10.3485201Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:10.3551195Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:11.7354761Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:11.7355323Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:11.7578798Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:27:11.7579593Z warnings.warn(
2022-09-27T16:27:11.7580728Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:27:11.7581468Z warnings.warn(
2022-09-27T16:27:12.6598003Z ok (4.214s)
2022-09-27T16:27:12.6646328Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=False)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57305
2022-09-27T16:27:12.6652818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57306
2022-09-27T16:27:14.3525624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:14.3526136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:14.3527233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:14.3527704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:14.3671147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:14.3671929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:14.3674969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:14.3675436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:14.6151360Z dist init r=1, world=2
2022-09-27T16:27:14.6155667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:14.6198882Z dist init r=0, world=2
2022-09-27T16:27:14.6203856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:14.6205110Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:14.6258539Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:15.9666864Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:15.9667818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:16.8737319Z ok (4.214s)
2022-09-27T16:27:16.8794721Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57390
2022-09-27T16:27:16.8801485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57391
2022-09-27T16:27:18.5673605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:18.5674164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:18.5675561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:18.5676047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:18.5699735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:18.5700323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:18.5702853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:18.5703338Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:18.8057387Z dist init r=0, world=2
2022-09-27T16:27:18.8060884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:18.8311803Z dist init r=1, world=2
2022-09-27T16:27:18.8317486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:18.8318284Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:18.8367331Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:20.2044142Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:20.2044664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:20.6877196Z ok (3.814s)
2022-09-27T16:27:20.6914419Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57471
2022-09-27T16:27:20.6920768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57472
2022-09-27T16:27:22.3137602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:22.3138116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:22.3138927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:22.3139388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:22.3280612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:22.3281070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:22.3284229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:22.3284707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:22.5708976Z dist init r=1, world=2
2022-09-27T16:27:22.5713435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:22.5714780Z dist init r=0, world=2
2022-09-27T16:27:22.5720420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:22.5721215Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:22.5816488Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:23.9466335Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:23.9466880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:24.9005360Z ok (4.213s)
2022-09-27T16:27:24.9053210Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57556
2022-09-27T16:27:24.9059449Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57557
2022-09-27T16:27:26.5741625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:26.5742132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:26.5743831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:26.5744337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:26.5910550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:27:26.5911298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:27:26.5914190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:27:26.5914671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:27:26.8346282Z dist init r=0, world=2
2022-09-27T16:27:26.8349563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:27:26.8442855Z dist init r=1, world=2
2022-09-27T16:27:26.8448061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:27:26.8448821Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:26.8452277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:27:28.2028793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:27:28.2029363Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:27:28.6133205Z ok (3.713s)
2022-09-27T16:27:28.6180880Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_POST_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57637 2022-09-27T16:27:28.6188438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57638 2022-09-27T16:27:30.2560566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:30.2561563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:30.2562699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:30.2563196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:30.2755774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:30.2756244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:30.2759365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:30.2759856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:30.5176544Z dist init r=0, world=2 2022-09-27T16:27:30.5180261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:30.5246671Z dist init r=1, world=2 2022-09-27T16:27:30.5252125Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:30.5253053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:30.5282982Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:31.9245306Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:31.9245814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:32.9276185Z ok (4.314s) 2022-09-27T16:27:32.9329293Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57722 2022-09-27T16:27:32.9336327Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57723 2022-09-27T16:27:34.5998320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:34.5998833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:34.5999845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:34.6000328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:34.6155173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:34.6155637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:34.6158492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:34.6159012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:34.8621101Z dist init r=1, world=2 2022-09-27T16:27:34.8625422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:34.8698569Z dist init r=0, world=2 2022-09-27T16:27:34.8703787Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:34.8704706Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:34.8727841Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:36.2632869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:36.2633411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:36.7413291Z ok (3.814s) 2022-09-27T16:27:36.7467900Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_False_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57803 2022-09-27T16:27:36.7474977Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57804 2022-09-27T16:27:38.4079472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:38.4079982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:38.4081978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:38.4082474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:38.4459820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:38.4460288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:38.4462844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:38.4463327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:38.6582844Z dist init r=0, world=2 2022-09-27T16:27:38.6586664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:38.6878308Z dist init r=1, world=2 2022-09-27T16:27:38.6883372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:38.6884337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:38.6892340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:40.0420155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:40.0420721Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:41.0561902Z ok (4.315s) 2022-09-27T16:27:41.0610737Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_AFTER (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57888 2022-09-27T16:27:41.0617206Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57889 2022-09-27T16:27:42.7507742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:42.7508226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:42.7509062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:42.7509747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:42.7630131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:42.7630578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:42.7634790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:42.7635273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:43.0021752Z dist init r=0, world=2 2022-09-27T16:27:43.0026297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:43.0104770Z dist init r=1, world=2 2022-09-27T16:27:43.0111055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:43.0112503Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:43.0130034Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:44.3739106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:44.3739625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:44.8698618Z ok (3.814s) 2022-09-27T16:27:44.8748239Z test_main_wrap_api_cpu_offload_CPUOffload(offload_params=True)_backward_prefetch_BackwardPrefetch_BACKWARD_PRE_forward_prefetch_True_cuda_init_mode_CUDAInitMode_CUDA_BEFORE (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57969 2022-09-27T16:27:44.8755754Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57970 2022-09-27T16:27:46.5704080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:46.5704617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:46.5705712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:46.5706184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:46.5973719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:46.5974194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:46.5977053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:46.5977528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:46.8276183Z dist init r=1, world=2 2022-09-27T16:27:46.8280329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:46.8332925Z dist init r=0, world=2 2022-09-27T16:27:46.8337828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:46.8339250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:46.8383331Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:48.2230044Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:48.2230560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:49.1854559Z ok (4.316s) 2022-09-27T16:27:49.1887295Z test_wrap_batchnorm_individually_use_or_policy_False (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58054 2022-09-27T16:27:49.1893993Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58055 2022-09-27T16:27:50.8399802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:50.8400357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:50.8401485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:50.8401965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:50.8592419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:50.8592907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:50.8595594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:50.8596084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:51.1017152Z dist init r=1, world=2 2022-09-27T16:27:51.1021361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:51.1022413Z dist init r=0, world=2 2022-09-27T16:27:51.1027902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:51.1028914Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:51.1124128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:52.4943765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:52.4944329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:52.5184429Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:27:52.5185215Z warnings.warn( 2022-09-27T16:27:52.5186319Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:27:52.5187036Z warnings.warn( 2022-09-27T16:27:52.9971240Z ok (3.812s) 2022-09-27T16:27:53.0003234Z test_wrap_batchnorm_individually_use_or_policy_True (__main__.TestFSDPWrap) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58135 2022-09-27T16:27:53.0010473Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58136 2022-09-27T16:27:54.7120327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:54.7120827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:54.7121690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:54.7122159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:54.7205082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:27:54.7205539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:27:54.7208814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:27:54.7209303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:27:54.9717966Z dist init r=0, world=2 2022-09-27T16:27:54.9721928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:27:54.9728420Z dist init r=1, world=2 2022-09-27T16:27:54.9733688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:27:54.9734597Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:54.9825024Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:27:56.3656850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:27:56.3657404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:27:56.3905014Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:27:56.3905802Z warnings.warn( 2022-09-27T16:27:56.3906907Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:27:56.3907671Z warnings.warn( 2022-09-27T16:27:56.8088878Z ok (3.812s) 2022-09-27T16:27:56.8089107Z 2022-09-27T16:27:56.8089478Z ---------------------------------------------------------------------- 2022-09-27T16:27:56.8089823Z Ran 46 tests in 95.476s 2022-09-27T16:27:56.8089987Z 2022-09-27T16:27:56.8094412Z OK 2022-09-27T16:27:56.8094640Z 2022-09-27T16:27:56.8094782Z Generating XML reports... 
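[Editor's note on the fully_sharded_data_parallel.py:1414 UserWarning above] The warning recommends passing `device_id` when wrapping a CPU-resident module so that flattening and sharding run on GPU. A minimal sketch of that fix, assuming a torchrun-style launch where RANK/WORLD_SIZE environment variables are set and each rank owns one GPU; the Linear module is a placeholder:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    dist.init_process_group("nccl")  # assumes torchrun-provided env vars
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = torch.nn.Linear(8, 8)  # constructed on CPU, as in the warning

    # device_id lets FSDP move the module to the GPU itself, so flattening
    # and sharding run on-device and sync_module_states can use NCCL.
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),
        sync_module_states=True,
    )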
2022-09-27T16:27:56.8141679Z [W python_variable.cpp:326] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-09-27T16:27:56.8581843Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestAutoWrap-20220927162621.xml 2022-09-27T16:27:56.8608513Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestFSDPWrap-20220927162621.xml 2022-09-27T16:27:57.2769417Z Running distributed/fsdp/test_fsdp_misc ... [2022-09-27 16:27:57.276451] 2022-09-27T16:27:57.2770495Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_misc.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:27:57.276531] 2022-09-27T16:27:59.1641217Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_misc 2022-09-27T16:27:59.1661865Z 2022-09-27T16:27:59.1662306Z Running tests...
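[Editor's note on the [W python_variable.cpp:326] decref warning above] The warning describes taking a weak reference to a Tensor and dereferencing it without calling `_fix_weakref()`. Below is a hedged sketch of the pattern the warning text names; `_fix_weakref()` is an internal PyTorch API, and exactly when the warning fires at teardown depends on interpreter and allocator details:

    import weakref
    import torch

    t = torch.ones(2)
    ref = weakref.ref(t)   # take a weak reference to the tensor
    _ = ref()              # dereference it, reviving the PyObject
    t._fix_weakref()       # what the warning asks for after dereferencing
    del t                  # without the call above, deallocation may warn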
2022-09-27T16:27:59.1663195Z ---------------------------------------------------------------------- 2022-09-27T16:27:59.1670109Z test_cpu_init_with_sync_module_states (__main__.TestFSDPMisc) 2022-09-27T16:28:00.6551559Z Tests that passing ``sync_module_states=True`` raises an error for ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:28:00.6744106Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58251 2022-09-27T16:28:00.6750454Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58252 2022-09-27T16:28:02.3166722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:02.3167741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:02.3168950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:02.3169846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:02.3340753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:02.3341242Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:02.3343833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:02.3344318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:02.5599256Z dist init r=0, world=2 2022-09-27T16:28:02.5603022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:02.5638187Z dist init r=1, world=2 2022-09-27T16:28:02.5643588Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:02.5644586Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:02.5707067Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:03.9726980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:03.9727937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:03.9937305Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:28:03.9939226Z warnings.warn( 2022-09-27T16:28:03.9941536Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
2022-09-27T16:28:03.9943069Z warnings.warn( 2022-09-27T16:28:04.4831608Z ok (5.317s) 2022-09-27T16:28:04.4838955Z test_device_id_auto_wrap (__main__.TestFSDPMisc) 2022-09-27T16:28:04.4865139Z Tests that ``auto_wrap_policy`` propagates ``device_id`` to all ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58332 2022-09-27T16:28:04.4872039Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58333 2022-09-27T16:28:06.0922194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:06.0923170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:06.0924339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:06.0925263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:06.1279659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:06.1280587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:06.1282596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:06.1283550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:06.3307326Z dist init r=0, world=2 2022-09-27T16:28:06.3312004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:06.3526972Z dist init r=1, world=2 2022-09-27T16:28:06.3532134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:06.3533267Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:06.3617258Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:07.7460726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:07.7461244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:08.1948107Z ok (3.712s) 2022-09-27T16:28:08.1956513Z test_fsdp_cpu_init_stays_on_cpu (__main__.TestFSDPMisc) 2022-09-27T16:28:08.1981587Z Tests that passing a CPU module to FSDP preserves that the wrapped ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58413 2022-09-27T16:28:08.1987726Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58414 2022-09-27T16:28:09.8169474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:09.8169958Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:09.8170939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:09.8171413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:09.8451067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:09.8451559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:09.8454870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:09.8455361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:10.0598512Z dist init r=1, world=2 2022-09-27T16:28:10.0602372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:10.0706647Z dist init r=0, world=2 2022-09-27T16:28:10.0711781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:10.0712911Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:10.0806276Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:11.4872807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:11.4873343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:12.4073492Z ok (4.212s) 2022-09-27T16:28:12.4084002Z test_fsdp_device_id_cpu_offload (__main__.TestFSDPMisc) 2022-09-27T16:28:12.4109510Z Ensures that even if device_id is specified but we have ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58498 2022-09-27T16:28:12.4116866Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58499 2022-09-27T16:28:14.0399446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:14.0399983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:14.0400837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:14.0401332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:14.0531969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:14.0532439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:14.0535255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:14.0535730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:14.2702150Z dist init r=1, world=2 2022-09-27T16:28:14.2705992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:14.2747853Z dist init r=0, world=2 2022-09-27T16:28:14.2753481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:14.2754961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:14.2808794Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:15.6500869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:15.6501891Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:16.1192720Z ok (3.712s) 2022-09-27T16:28:16.1208498Z test_fsdp_device_id_use_index_False (__main__.TestFSDPMisc) 2022-09-27T16:28:16.1233024Z Tests the FSDP ``device_id`` argument: ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58579 2022-09-27T16:28:16.1239864Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58580 2022-09-27T16:28:17.7404052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:17.7404946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:17.7405901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:17.7406392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:17.7698282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:17.7698737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:17.7701836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:17.7702306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:17.9742177Z dist init r=0, world=2 2022-09-27T16:28:17.9746244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:17.9946103Z dist init r=1, world=2 2022-09-27T16:28:17.9951520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:17.9952777Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:18.0052159Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:19.3768609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:19.3769133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:19.8330749Z ok (3.714s) 2022-09-27T16:28:19.8344918Z test_fsdp_device_id_use_index_True (__main__.TestFSDPMisc) 2022-09-27T16:28:19.8368947Z Tests the FSDP ``device_id`` argument: ... 
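[Editor's note on the test_fsdp_device_id_use_index_False/True pair above] The two variants exercise the two accepted spellings of FSDP's `device_id`: a bare integer index or a `torch.device`. A sketch under the assumption that a default process group is already initialized (see the init sketch further up); the Linear modules are placeholders:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Both spellings target the same GPU; each call requires an
    # initialized default process group.
    wrapped_by_index = FSDP(torch.nn.Linear(4, 4), device_id=0)
    wrapped_by_device = FSDP(torch.nn.Linear(4, 4),
                             device_id=torch.device("cuda", 0))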
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58660 2022-09-27T16:28:19.8375251Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58661 2022-09-27T16:28:21.4773973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:21.4774504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:21.4775590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:21.4776069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:21.5266433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:21.5266892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:21.5269026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:21.5269499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:21.7108606Z dist init r=1, world=2 2022-09-27T16:28:21.7112682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:21.7475107Z dist init r=0, world=2 2022-09-27T16:28:21.7480304Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:21.7481596Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:21.7519758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:23.1429176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:23.1429707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:23.6453900Z ok (3.812s) 2022-09-27T16:28:23.6491168Z test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy_None (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58741 2022-09-27T16:28:23.6497431Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58742 2022-09-27T16:28:25.2877975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:25.2878484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:25.2879747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:25.2880228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:25.3301616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:25.3302074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:25.3304753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:25.3305247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:25.5280011Z dist init r=1, world=2 2022-09-27T16:28:25.5283879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:25.5465469Z dist init r=0, world=2 2022-09-27T16:28:25.5470468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:25.5471520Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:25.5488347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:26.9655907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:26.9656446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:27.8602083Z ok (4.215s) 2022-09-27T16:28:27.8640729Z test_fsdp_module_no_compute_grad_use_second_layer_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58826 2022-09-27T16:28:27.8646825Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58827 2022-09-27T16:28:29.5411182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:29.5412167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:29.5413343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:29.5414288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:29.5525821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:29.5526726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:29.5530131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:29.5531135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:29.7797079Z dist init r=1, world=2 2022-09-27T16:28:29.7797661Z dist init r=0, world=2 2022-09-27T16:28:29.7801377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:29.7802408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:29.7803901Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:29.7805281Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:31.1881068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:31.1882112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:32.0733553Z ok (4.213s) 2022-09-27T16:28:32.0769821Z test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy_None (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58911 2022-09-27T16:28:32.0776670Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58912 2022-09-27T16:28:33.7208440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:33.7209506Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:33.7210713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:33.7211965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:33.7638919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:33.7639865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:33.7642484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:33.7643444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:33.9543470Z dist init r=1, world=2 2022-09-27T16:28:33.9547836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:33.9862977Z dist init r=0, world=2 2022-09-27T16:28:33.9868836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:33.9870364Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:33.9954575Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:35.3960395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:35.3960920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:36.2862781Z ok (4.213s) 2022-09-27T16:28:36.2899998Z test_fsdp_module_no_compute_grad_use_second_layer_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58996 2022-09-27T16:28:36.2906075Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58997 2022-09-27T16:28:37.9602607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:37.9603461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:37.9604628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:37.9605093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:37.9712519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:37.9713249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:37.9716125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:37.9716881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:38.1932508Z dist init r=1, world=2 2022-09-27T16:28:38.1936957Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:38.2019128Z dist init r=0, world=2 2022-09-27T16:28:38.2024619Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:38.2026094Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:38.2040060Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:39.5871440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:39.5872286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:40.4992338Z ok (4.213s) 2022-09-27T16:28:40.5025278Z test_fsdp_namedtuple (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59081 2022-09-27T16:28:40.5031577Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59082 2022-09-27T16:28:42.1322659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:42.1323251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:42.1324089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:42.1324550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:42.1525359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:42.1525823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:42.1529227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:42.1529693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:42.3676990Z dist init r=1, world=2 2022-09-27T16:28:42.3681291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:42.3770022Z dist init r=0, world=2 2022-09-27T16:28:42.3775180Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:42.3776152Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:42.3783778Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:43.7851473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:43.7852009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:44.2106162Z ok (3.711s) 2022-09-27T16:28:44.2143776Z test_fsdp_not_all_outputs_used_in_loss (__main__.TestFSDPMisc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59162 2022-09-27T16:28:44.2149989Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59163 2022-09-27T16:28:45.8277616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:45.8278601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:45.8279760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:45.8280626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:45.8540216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:45.8541155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:45.8542260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:45.8543230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:46.0714278Z dist init r=1, world=2 2022-09-27T16:28:46.0718765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:46.0804252Z dist init r=0, world=2 2022-09-27T16:28:46.0809556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:46.0810347Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:46.0821337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:47.4812752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:47.4813292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:47.5088570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:28:47.5089382Z warnings.warn(msg, FutureWarning) 2022-09-27T16:28:47.5090294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844. 2022-09-27T16:28:47.5090950Z warnings.warn(msg, FutureWarning) 2022-09-27T16:28:48.5237727Z ok (4.313s) 2022-09-27T16:28:48.5248465Z test_fsdp_same_model_across_ranks (__main__.TestFSDPMisc) 2022-09-27T16:28:48.5262592Z FSDP broadcasts model from rank 0 to ensure it starts off with the same ... 
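[Editor's note on the FutureWarning above] The deprecation notice for torch.testing.assert_allclose points to torch.testing.assert_close as the replacement; the migration is a one-line change (tensors here are placeholders):

    import torch

    actual = torch.tensor([1.0, 2.0])
    expected = torch.tensor([1.0, 2.0])

    # torch.testing.assert_allclose(actual, expected)  # deprecated since 1.12
    torch.testing.assert_close(actual, expected)       # replacement API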
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59247 2022-09-27T16:28:48.5269624Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59248 2022-09-27T16:28:50.1724665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:50.1725378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:50.1725968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:50.1726438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:50.1862101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:50.1862568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:50.1865427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:50.1865917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:50.4105088Z dist init r=1, world=2 2022-09-27T16:28:50.4108852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:50.4150946Z dist init r=0, world=2 2022-09-27T16:28:50.4156408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:50.4157607Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:50.4211697Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:51.8221633Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:51.8222177Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:52.3345081Z ok (3.811s) 2022-09-27T16:28:52.3350263Z test_module_device_mismatches_device_id (__main__.TestFSDPMisc) 2022-09-27T16:28:52.3364675Z Tests that specifying a ``device_id`` argument to FSDP for a GPU ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59328 2022-09-27T16:28:52.3371155Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59329 2022-09-27T16:28:53.9319354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:53.9320044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:53.9321200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:53.9321896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:53.9716110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:53.9716573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:53.9719701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:53.9720418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:54.1618469Z dist init r=1, world=2 2022-09-27T16:28:54.1622511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:54.1930417Z dist init r=0, world=2 2022-09-27T16:28:54.1935735Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:54.1936522Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:54.2028957Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:55.5823173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:55.5823702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:56.0448659Z ok (3.710s) 2022-09-27T16:28:56.0453475Z test_multi_device_not_supported (__main__.TestFSDPMisc) 2022-09-27T16:28:56.0467911Z Tests that wrapping a multi-device module (i.e. with submodules on ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59409 2022-09-27T16:28:56.0474491Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59410 2022-09-27T16:28:57.6745540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:57.6746407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:57.6747439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:57.6747938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:57.6941062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:28:57.6941512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:28:57.6944447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:28:57.6944919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:28:57.9159128Z dist init r=0, world=2 2022-09-27T16:28:57.9163458Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:28:57.9252085Z dist init r=1, world=2 2022-09-27T16:28:57.9257441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:28:57.9258253Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:57.9266143Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:28:59.2990302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:28:59.2991258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:28:59.7551870Z ok (3.710s) 2022-09-27T16:28:59.7558332Z test_no_params (__main__.TestFSDPMisc) 2022-09-27T16:28:59.7571936Z Test that device_id and cpu init work if module has no params ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59490 2022-09-27T16:28:59.7578369Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59491 2022-09-27T16:29:01.3645896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:01.3646442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:01.3647350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:01.3647821Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:01.3667610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:01.3668165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:01.3672586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:01.3673067Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:01.5899930Z dist init r=0, world=2 2022-09-27T16:29:01.5903560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:01.6017889Z dist init r=1, world=2 2022-09-27T16:29:01.6022842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:01.6024191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:01.6107798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:02.9961171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:02.9961693Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:03.4654988Z ok (3.710s) 2022-09-27T16:29:03.4655314Z 2022-09-27T16:29:03.4655853Z ---------------------------------------------------------------------- 2022-09-27T16:29:03.4656215Z Ran 16 tests in 64.299s 2022-09-27T16:29:03.4656384Z 2022-09-27T16:29:03.4659198Z OK 2022-09-27T16:29:03.4659635Z 2022-09-27T16:29:03.4659877Z Generating XML reports... 2022-09-27T16:29:03.4725409Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_misc/TEST-TestFSDPMisc-20220927162759.xml 2022-09-27T16:29:03.8409948Z Running distributed/fsdp/test_fsdp_grad_acc ... [2022-09-27 16:29:03.840501] 2022-09-27T16:29:03.8410677Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_grad_acc.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:29:03.840580] 2022-09-27T16:29:05.7280236Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_grad_acc 2022-09-27T16:29:05.7298917Z 2022-09-27T16:29:05.7299354Z Running tests... 2022-09-27T16:29:05.7300007Z ---------------------------------------------------------------------- 2022-09-27T16:29:05.7310355Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:07.2678874Z Tests gradient accumulation. ... 
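The use_no_sync entries in the parameterized TestGradAcc names describe when FSDP's no_sync() context is active: inside it, gradient synchronization is deferred and gradients accumulate locally; the first backward pass after leaving it synchronizes. A minimal sketch of the pattern under test, assuming a script launched with torchrun --nproc_per_node=2 on a two-GPU host:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    dist.init_process_group("nccl")  # torchrun supplies rank/world size
    torch.cuda.set_device(dist.get_rank())

    # Toy model; device_id moves it onto this rank's GPU before sharding.
    model = FSDP(torch.nn.Linear(8, 8), device_id=torch.cuda.current_device())
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(4, 8, device="cuda")

    with model.no_sync():          # use_no_sync=True: defer gradient sync
        model(x).sum().backward()  # gradients accumulate locally
    model(x).sum().backward()      # outside no_sync: this backward synchronizes
    opt.step()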
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:29:07.2858512Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59606 2022-09-27T16:29:07.2864950Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59607 2022-09-27T16:29:08.9445322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:08.9445832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:08.9447148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:08.9447608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:08.9653419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:08.9653882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:08.9656724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:08.9657189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:09.2081460Z dist init r=1, world=2 2022-09-27T16:29:09.2085479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:09.2117814Z dist init r=0, world=2 2022-09-27T16:29:09.2123143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:09.2124634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:09.2188707Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:10.6010511Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:10.6011052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:10.6317419Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:10.6318431Z warnings.warn( 2022-09-27T16:29:10.6319846Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:10.6320609Z warnings.warn( 2022-09-27T16:29:12.1965057Z ok (6.466s) 2022-09-27T16:29:12.1977208Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:12.1992459Z Tests gradient accumulation. ... 
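The UserWarning above, printed once per spawned rank, is FSDP explaining that flattening and sharding a CPU-resident module runs on CPU; passing device_id moves the module to a GPU first, which also makes it compatible with sync_module_states=True. A minimal sketch, assuming an initialized process group:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    cpu_module = torch.nn.Linear(8, 8)  # built on CPU, as in the warning

    # device_id tells FSDP to move the module to the given GPU before
    # flattening and sharding, avoiding the slower CPU path warned about.
    model = FSDP(cpu_module, device_id=torch.cuda.current_device())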
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59691 2022-09-27T16:29:12.1999647Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59692 2022-09-27T16:29:13.8155397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:13.8155966Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:13.8157568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:13.8158066Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:13.8436991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:13.8437461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:13.8440097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:13.8440581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:14.0716442Z dist init r=1, world=2 2022-09-27T16:29:14.0720307Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:14.0893572Z dist init r=0, world=2 2022-09-27T16:29:14.0899388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:14.0900227Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:14.0924677Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:15.4343246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:15.4343760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:15.4635460Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:15.4636359Z warnings.warn( 2022-09-27T16:29:15.4637474Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:15.4638222Z warnings.warn( 2022-09-27T16:29:16.9089695Z ok (4.712s) 2022-09-27T16:29:16.9099554Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-09-27T16:29:16.9113530Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59776 2022-09-27T16:29:16.9120443Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59777 2022-09-27T16:29:18.5353668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:18.5354156Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:18.5355236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:18.5355708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:18.5600102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:18.5600540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:18.5603307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:18.5604002Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:18.7896457Z dist init r=0, world=2 2022-09-27T16:29:18.7899972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:18.8043571Z dist init r=1, world=2 2022-09-27T16:29:18.8048722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:18.8049842Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:18.8104939Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:20.1725496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:20.1726358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:20.2034541Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:20.2035312Z warnings.warn( 2022-09-27T16:29:20.2036419Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:20.2037159Z warnings.warn( 2022-09-27T16:29:21.7211980Z ok (4.812s) 2022-09-27T16:29:21.7222741Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:21.7237104Z Tests gradient accumulation. ... 
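Besides use_no_sync, the TestGradAcc grid sweeps cpu_offload and sharding_strategy, as in the test name above. A minimal sketch of that grid point, assuming an initialized process group:

    import torch
    from torch.distributed.fsdp import (
        CPUOffload,
        FullyShardedDataParallel as FSDP,
        ShardingStrategy,
    )

    model = FSDP(
        torch.nn.Linear(8, 8),
        # offload_params=True keeps parameters on CPU between uses
        cpu_offload=CPUOffload(offload_params=True),
        # FULL_SHARD shards parameters, gradients, and optimizer state
        sharding_strategy=ShardingStrategy.FULL_SHARD,
    )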
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59861 2022-09-27T16:29:21.7244003Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59862 2022-09-27T16:29:23.4170630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:23.4171192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:23.4172218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:23.4172701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:23.4363624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:23.4364090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:23.4366653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:23.4367105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:23.6818148Z dist init r=0, world=2 2022-09-27T16:29:23.6821873Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:23.6902708Z dist init r=1, world=2 2022-09-27T16:29:23.6907983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:23.6908783Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:23.6924722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:25.0644429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:25.0644940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:25.5321623Z ok (3.811s) 2022-09-27T16:29:25.5332038Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:25.5345832Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59942 2022-09-27T16:29:25.5352908Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59943 2022-09-27T16:29:27.1676901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:27.1677550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:27.1678610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:27.1679088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:27.1847727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:27.1848172Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:27.1851168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:27.1851648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:27.4260807Z dist init r=1, world=2 2022-09-27T16:29:27.4264701Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:27.4315119Z dist init r=0, world=2 2022-09-27T16:29:27.4320727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:27.4321516Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:27.4367637Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:28.8228253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:28.8228794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:29.2428147Z ok (3.711s) 2022-09-27T16:29:29.2438875Z test_grad_acc_configs_[(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-09-27T16:29:29.2452284Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60023 2022-09-27T16:29:29.2459096Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60024 2022-09-27T16:29:30.8704742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:30.8705437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:30.8706046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:30.8706513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:30.8801350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:30.8801831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:30.8804823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:30.8805327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:31.1196884Z dist init r=0, world=2 2022-09-27T16:29:31.1200959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:31.1254619Z dist init r=1, world=2 2022-09-27T16:29:31.1259795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:31.1260773Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:31.1303760Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:32.4927974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:32.4928518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:32.9533915Z ok (3.711s) 2022-09-27T16:29:32.9544442Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:32.9558365Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60104 2022-09-27T16:29:32.9564846Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60105 2022-09-27T16:29:34.5726695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:34.5727546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:34.5728422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:34.5728913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:34.6129924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:34.6130370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:34.6133472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:34.6133947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:34.8220451Z dist init r=1, world=2 2022-09-27T16:29:34.8224181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:34.8550256Z dist init r=0, world=2 2022-09-27T16:29:34.8555890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:34.8556694Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:34.8631129Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:36.2275518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:36.2276057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:36.2599625Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:36.2600446Z warnings.warn( 2022-09-27T16:29:36.2603899Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:36.2604685Z warnings.warn( 2022-09-27T16:29:37.8660084Z ok (4.912s) 2022-09-27T16:29:37.8670213Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:37.8684590Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60189 2022-09-27T16:29:37.8691390Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60190 2022-09-27T16:29:39.4978783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:39.4979536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:39.4980652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:39.4981141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:39.5173588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:39.5174055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:39.5177211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:39.5177684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:39.7633936Z dist init r=0, world=2 2022-09-27T16:29:39.7635356Z dist init r=1, world=2 2022-09-27T16:29:39.7638286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:39.7641257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:39.7642125Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:39.7741425Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:41.1153576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:41.1154082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:41.1476347Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:41.1477134Z warnings.warn( 2022-09-27T16:29:41.1481108Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:41.1481859Z warnings.warn( 2022-09-27T16:29:42.5789152Z ok (4.713s) 2022-09-27T16:29:42.5799722Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=False)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-09-27T16:29:42.5812456Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60274 2022-09-27T16:29:42.5818632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60275 2022-09-27T16:29:44.2377971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:44.2378482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:44.2379663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:44.2380126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:44.2668176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:44.2668649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:44.2671728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:44.2672182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:44.4995858Z dist init r=0, world=2 2022-09-27T16:29:44.5000508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:44.5138326Z dist init r=1, world=2 2022-09-27T16:29:44.5143688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:44.5144588Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:44.5204973Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:45.8982746Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:45.8983271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:45.9277814Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:45.9278689Z warnings.warn( 2022-09-27T16:29:45.9280046Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:29:45.9280800Z warnings.warn( 2022-09-27T16:29:47.3914711Z ok (4.812s) 2022-09-27T16:29:47.3925205Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:47.3937994Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60359 2022-09-27T16:29:47.3944298Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60360 2022-09-27T16:29:49.0195634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:49.0196163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:49.0197190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:49.0197699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:49.0845152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:49.0845633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:49.0846962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:49.0847433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:49.2708147Z dist init r=0, world=2 2022-09-27T16:29:49.2712244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:49.3236922Z dist init r=1, world=2 2022-09-27T16:29:49.3242268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:49.3243381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:49.3322096Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:50.6833944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:50.6834477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:51.1019494Z ok (3.710s) 2022-09-27T16:29:51.1029647Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestGradAcc) 2022-09-27T16:29:51.1043261Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60440 2022-09-27T16:29:51.1049511Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60441 2022-09-27T16:29:52.7727655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:52.7728137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:52.7729251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:52.7729726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:52.8053638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:52.8054074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:52.8057132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:52.8057608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:53.0366765Z dist init r=0, world=2 2022-09-27T16:29:53.0370419Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:53.0514849Z dist init r=1, world=2 2022-09-27T16:29:53.0520201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:53.0521095Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:53.0574787Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:54.4226696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:54.4227241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:54.9127080Z ok (3.811s) 2022-09-27T16:29:54.9138220Z test_grad_acc_configs_[(use_no_sync=True,num_iters=3),(use_no_sync=False,num_iters=3),(use_no_sync=True,num_iters=3)]_cpu_offload_CPUOffload(offload_params=True)_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestGradAcc) 2022-09-27T16:29:54.9152191Z Tests gradient accumulation. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60521 2022-09-27T16:29:54.9159005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60522 2022-09-27T16:29:56.5407453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:56.5408319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:56.5408907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:56.5409642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:56.5729775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:29:56.5730238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:29:56.5733431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:29:56.5733912Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:29:56.7997550Z dist init r=0, world=2 2022-09-27T16:29:56.8001396Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:29:56.8187223Z dist init r=1, world=2 2022-09-27T16:29:56.8192779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:29:56.8193906Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:56.8205247Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:29:58.1857334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:29:58.1857862Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:29:58.6236640Z ok (3.711s) 2022-09-27T16:29:58.6236859Z 2022-09-27T16:29:58.6237251Z ---------------------------------------------------------------------- 2022-09-27T16:29:58.6237594Z Ran 12 tests in 52.894s 2022-09-27T16:29:58.6237742Z 2022-09-27T16:29:58.6239560Z OK 2022-09-27T16:29:58.6250669Z 2022-09-27T16:29:58.6250871Z Generating XML reports... 2022-09-27T16:29:58.6304011Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_grad_acc/TEST-TestGradAcc-20220927162905.xml 2022-09-27T16:29:58.9889597Z Running distributed/test_c10d_spawn_nccl ... [2022-09-27 16:29:58.988464] 2022-09-27T16:29:58.9890385Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_nccl.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-09-27 16:29:58.988547] 2022-09-27T16:30:00.8395888Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4v6ni7b8 2022-09-27T16:30:00.8400429Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4v6ni7b8/_remote_module_non_scriptable.py 2022-09-27T16:30:02.3375261Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:02.3423769Z 2022-09-27T16:30:02.3424112Z 2022-09-27T16:30:02.3425717Z <unittest.suite.TestSuite tests=[<__main__.TestDistributedNNFunctionsNccl testMethod=test_all_gather>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_gather_base>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter_non_contiguous>]> 2022-09-27T16:30:02.3427124Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3427551Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3427936Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3428346Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3428759Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3429154Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3429639Z test_reduce (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3430045Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:02.3430484Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) 2022-09-27T16:30:03.9058004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:03.9058584Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:03.9059992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:03.9060477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:04.1350612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3y10oukq 2022-09-27T16:30:04.1351996Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3y10oukq/_remote_module_non_scriptable.py 2022-09-27T16:30:05.5854816Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:05.5922735Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:05.5938007Z 2022-09-27T16:30:05.5938236Z Running tests... 2022-09-27T16:30:05.5938647Z ---------------------------------------------------------------------- 2022-09-27T16:30:05.6554221Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) ... 
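The TestDistributedNNFunctionsNccl cases enumerated above exercise the autograd-aware collectives in torch.distributed.nn, which, unlike the raw c10d operations, let gradients flow back through the communication. A minimal sketch for all_reduce, assuming a launch with torchrun --nproc_per_node=2:

    import torch
    import torch.distributed as dist
    import torch.distributed.nn as dist_nn

    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    t = torch.ones(2, device="cuda", requires_grad=True)
    out = dist_nn.all_reduce(t)  # differentiable: sums the tensor across ranks
    out.sum().backward()         # backward all-reduces the gradients as well
    print(rank, t.grad)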
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60671 2022-09-27T16:30:05.6560616Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60672 2022-09-27T16:30:07.2901172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:07.2901907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:07.2903458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:07.2903976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:07.2915218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:07.2915926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:07.2920107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:07.2920607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:07.5169744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwwzh395i 2022-09-27T16:30:07.5171246Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwwzh395i/_remote_module_non_scriptable.py 2022-09-27T16:30:07.5258778Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzsf3b22e 2022-09-27T16:30:07.5261259Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzsf3b22e/_remote_module_non_scriptable.py 2022-09-27T16:30:09.0635674Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:09.0686282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:30:09.0690230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:30:09.0727241Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:09.0778719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:30:09.0783155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:30:09.0784445Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:09.0792963Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:10.7666177Z ok (5.172s) 2022-09-27T16:30:10.7666575Z 2022-09-27T16:30:10.7667218Z ---------------------------------------------------------------------- 2022-09-27T16:30:10.7667856Z Ran 1 test in 5.173s 2022-09-27T16:30:10.7668154Z 2022-09-27T16:30:10.7668305Z OK 2022-09-27T16:30:10.7668620Z 2022-09-27T16:30:10.7668866Z Generating XML reports... 
2022-09-27T16:30:10.7706918Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163005.xml 2022-09-27T16:30:12.7329716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:12.7330230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:12.7332219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:12.7332702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:12.9615167Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppvz17p_2 2022-09-27T16:30:12.9616615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppvz17p_2/_remote_module_non_scriptable.py 2022-09-27T16:30:14.4199194Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:14.4265577Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:14.4281046Z 2022-09-27T16:30:14.4281297Z Running tests... 2022-09-27T16:30:14.4281747Z ---------------------------------------------------------------------- 2022-09-27T16:30:14.4938043Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60792 2022-09-27T16:30:14.4943364Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60793 2022-09-27T16:30:16.1095275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:16.1095803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:16.1097873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:16.1098381Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:16.1292513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:16.1292988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:16.1296693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:16.1297170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:16.3574570Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsh1x6ak4 2022-09-27T16:30:16.3575716Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsh1x6ak4/_remote_module_non_scriptable.py 2022-09-27T16:30:16.3636699Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfbprx56k 2022-09-27T16:30:16.3639764Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfbprx56k/_remote_module_non_scriptable.py 2022-09-27T16:30:17.9140927Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:17.9181489Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:17.9190374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:30:17.9195023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:30:17.9229178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:30:17.9233114Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:30:17.9234342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:17.9299826Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:19.6045664Z ok (5.176s) 2022-09-27T16:30:19.6045896Z 2022-09-27T16:30:19.6046282Z ---------------------------------------------------------------------- 2022-09-27T16:30:19.6046622Z Ran 1 test in 5.176s 2022-09-27T16:30:19.6046785Z 2022-09-27T16:30:19.6046859Z OK 2022-09-27T16:30:19.6046998Z 2022-09-27T16:30:19.6047135Z Generating XML reports... 2022-09-27T16:30:19.6081513Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163014.xml 2022-09-27T16:30:21.5584820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:21.5585838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:21.5587557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:21.5588534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:21.7866532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxztks1vl 2022-09-27T16:30:21.7867422Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxztks1vl/_remote_module_non_scriptable.py 2022-09-27T16:30:23.2521369Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:23.2590164Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:23.2605944Z 2022-09-27T16:30:23.2606200Z Running tests... 2022-09-27T16:30:23.2606626Z ---------------------------------------------------------------------- 2022-09-27T16:30:23.3211140Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60913 2022-09-27T16:30:23.3216121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60914 2022-09-27T16:30:24.9533209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:24.9533720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:24.9536641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:24.9537132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:24.9883699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:24.9884166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:24.9888222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:24.9888717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:25.1901407Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4qqhlaqp 2022-09-27T16:30:25.1902657Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4qqhlaqp/_remote_module_non_scriptable.py 2022-09-27T16:30:25.2124465Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfkaqq5j1 2022-09-27T16:30:25.2125750Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfkaqq5j1/_remote_module_non_scriptable.py 2022-09-27T16:30:26.7139852Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:26.7188448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:30:26.7192928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:30:26.7498594Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:26.7546605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:30:26.7550904Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:30:26.7551922Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:26.7599955Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:28.4318600Z ok (5.171s) 2022-09-27T16:30:28.4318792Z 2022-09-27T16:30:28.4319468Z ---------------------------------------------------------------------- 2022-09-27T16:30:28.4319822Z Ran 1 test in 5.171s 2022-09-27T16:30:28.4319996Z 2022-09-27T16:30:28.4320100Z OK 2022-09-27T16:30:28.4320244Z 2022-09-27T16:30:28.4320363Z Generating XML reports... 
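For reference, each of the NCCL cases above spawns two single-GPU processes, forms a process group, and runs one collective. A minimal self-contained sketch of that pattern, shown here with all_to_all_single; the rendezvous port and payload are illustrative, not taken from this job's configuration:

import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    # Illustrative rendezvous; the tests use their own store/port plumbing.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # Each rank contributes world_size equal chunks; chunk i lands on rank i.
    inp = torch.full((world_size,), float(rank), device="cuda")
    out = torch.empty_like(inp)
    dist.all_to_all_single(out, inp)
    # With two ranks, every rank now holds [0., 1.].
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)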
2022-09-27T16:30:28.4357362Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163023.xml 2022-09-27T16:30:30.4257780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:30.4258307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:30.4260512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:30.4261003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:30.6653201Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptvizf0uh 2022-09-27T16:30:30.6654389Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptvizf0uh/_remote_module_non_scriptable.py 2022-09-27T16:30:32.1501536Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:32.1569499Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:32.1585304Z 2022-09-27T16:30:32.1585655Z Running tests... 2022-09-27T16:30:32.1586111Z ---------------------------------------------------------------------- 2022-09-27T16:30:32.2194363Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61036 2022-09-27T16:30:32.2200804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61037 2022-09-27T16:30:33.8226082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:33.8226579Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:33.8229222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:33.8229700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:33.8232849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:33.8233326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:33.8239804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:33.8240302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:34.0594187Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0ld0ub7m 2022-09-27T16:30:34.0595236Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0ld0ub7m/_remote_module_non_scriptable.py 2022-09-27T16:30:34.0620292Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpodhsl_cr 2022-09-27T16:30:34.0622850Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpodhsl_cr/_remote_module_non_scriptable.py 2022-09-27T16:30:35.5770769Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:35.5818971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:30:35.5823754Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:30:35.5896515Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:35.5945686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:30:35.5949984Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:30:35.5951159Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:35.6028736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:37.2304452Z ok (5.071s) 2022-09-27T16:30:37.2304661Z 2022-09-27T16:30:37.2305072Z ---------------------------------------------------------------------- 2022-09-27T16:30:37.2305389Z Ran 1 test in 5.072s 2022-09-27T16:30:37.2305552Z 2022-09-27T16:30:37.2305667Z OK 2022-09-27T16:30:37.2305804Z 2022-09-27T16:30:37.2305937Z Generating XML reports... 2022-09-27T16:30:37.2343557Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163032.xml 2022-09-27T16:30:39.1525198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:39.1525705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:39.1528730Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:39.1529211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:39.3884314Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbwivt0e3 2022-09-27T16:30:39.3885581Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbwivt0e3/_remote_module_non_scriptable.py 2022-09-27T16:30:40.8757616Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:40.8823794Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:40.8843759Z 2022-09-27T16:30:40.8844042Z Running tests... 2022-09-27T16:30:40.8844489Z ---------------------------------------------------------------------- 2022-09-27T16:30:40.9462409Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61159 2022-09-27T16:30:40.9467912Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61160 2022-09-27T16:30:42.5270923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:42.5271455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:42.5272822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:42.5273318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:42.5467866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:42.5468353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:42.5472028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:42.5472503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:42.7677246Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpph6o5baw 2022-09-27T16:30:42.7677853Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpph6o5baw/_remote_module_non_scriptable.py 2022-09-27T16:30:42.7774720Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5kbhx_ag 2022-09-27T16:30:42.7777692Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5kbhx_ag/_remote_module_non_scriptable.py 2022-09-27T16:30:44.2960788Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:44.3009079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:30:44.3013089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:30:44.3200572Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:44.3248152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:30:44.3252393Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:30:44.3253187Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:44.3318816Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:45.9591355Z ok (5.074s) 2022-09-27T16:30:45.9591562Z 2022-09-27T16:30:45.9592261Z ---------------------------------------------------------------------- 2022-09-27T16:30:45.9592591Z Ran 1 test in 5.075s 2022-09-27T16:30:45.9592765Z 2022-09-27T16:30:45.9592878Z OK 2022-09-27T16:30:45.9593019Z 2022-09-27T16:30:45.9593155Z Generating XML reports... 
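The all_reduce just exercised sums each rank's tensor in place on every rank. A sketch of that step, run inside an already-initialized two-rank NCCL group like the spawn pattern above (shapes and values are illustrative):

import torch
import torch.distributed as dist

def allreduce_step(rank):
    t = torch.full((4,), float(rank + 1), device="cuda")
    dist.all_reduce(t, op=dist.ReduceOp.SUM)
    # Ranks 0 and 1 contribute 1s and 2s, so both now hold 3s.
    assert torch.allclose(t, torch.full((4,), 3.0, device="cuda"))

Note that the torch.distributed.nn wrappers under test additionally differentiate through the collective; this sketch shows only the forward communication.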
2022-09-27T16:30:45.9629855Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163040.xml 2022-09-27T16:30:47.8959067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:47.8959562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:47.8962058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:47.8962545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:48.1305222Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgzn943nv 2022-09-27T16:30:48.1306383Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgzn943nv/_remote_module_non_scriptable.py 2022-09-27T16:30:49.6251580Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:49.6320468Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:49.6336446Z 2022-09-27T16:30:49.6336740Z Running tests... 2022-09-27T16:30:49.6337176Z ---------------------------------------------------------------------- 2022-09-27T16:30:49.6983106Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61280 2022-09-27T16:30:49.6989256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61281 2022-09-27T16:30:51.3053803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:51.3054322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:51.3057030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:51.3057540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:51.3111315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:51.3112068Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:51.3116124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:51.3116600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:51.5282383Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_nbsri94 2022-09-27T16:30:51.5283501Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_nbsri94/_remote_module_non_scriptable.py 2022-09-27T16:30:51.5411880Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkpkrk878 2022-09-27T16:30:51.5414767Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkpkrk878/_remote_module_non_scriptable.py 2022-09-27T16:30:53.0542305Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:53.0591490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:30:53.0595719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:30:53.0607714Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:53.0654384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:30:53.0658253Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:30:53.0659035Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:53.0698729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:30:54.7110651Z ok (5.077s) 2022-09-27T16:30:54.7110862Z 2022-09-27T16:30:54.7111497Z ---------------------------------------------------------------------- 2022-09-27T16:30:54.7111823Z Ran 1 test in 5.077s 2022-09-27T16:30:54.7111998Z 2022-09-27T16:30:54.7112090Z OK 2022-09-27T16:30:54.7112228Z 2022-09-27T16:30:54.7112361Z Generating XML reports... 2022-09-27T16:30:54.7147449Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163049.xml 2022-09-27T16:30:56.6427869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:30:56.6428375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:30:56.6430840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:30:56.6431610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:30:56.8709881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprjfq94y8 2022-09-27T16:30:56.8711438Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprjfq94y8/_remote_module_non_scriptable.py 2022-09-27T16:30:58.3279959Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:30:58.3345311Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:30:58.3360890Z 2022-09-27T16:30:58.3361007Z Running tests... 2022-09-27T16:30:58.3361758Z ---------------------------------------------------------------------- 2022-09-27T16:30:58.3965157Z test_reduce (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61401 2022-09-27T16:30:58.3970768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61402 2022-09-27T16:31:00.0183877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:00.0184937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:00.0186658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:00.0187633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:00.0399469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:00.0400366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:00.0403702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:00.0404642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:00.2661261Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3d0o6xxo 2022-09-27T16:31:00.2662372Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3d0o6xxo/_remote_module_non_scriptable.py 2022-09-27T16:31:00.2701814Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp532mp2gp 2022-09-27T16:31:00.2704237Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp532mp2gp/_remote_module_non_scriptable.py 2022-09-27T16:31:01.8290126Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:01.8342723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:01.8346931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:01.8360630Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:01.8412044Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:01.8416091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:01.8417174Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:01.8450297Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:03.5073428Z ok (5.171s) 2022-09-27T16:31:03.5073671Z 2022-09-27T16:31:03.5074084Z ---------------------------------------------------------------------- 2022-09-27T16:31:03.5074435Z Ran 1 test in 5.171s 2022-09-27T16:31:03.5074598Z 2022-09-27T16:31:03.5074692Z OK 2022-09-27T16:31:03.5074836Z 2022-09-27T16:31:03.5074957Z Generating XML reports... 
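Unlike all_reduce, the reduce exercised above leaves the summed result only on the destination rank. Under the same assumed two-rank setup:

import torch
import torch.distributed as dist

def reduce_step(rank):
    t = torch.full((4,), float(rank + 1), device="cuda")
    dist.reduce(t, dst=0, op=dist.ReduceOp.SUM)
    if rank == 0:
        # Only dst is guaranteed to hold the reduction; the contents on
        # other ranks are unspecified after the call.
        assert torch.allclose(t, torch.full((4,), 3.0, device="cuda"))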
2022-09-27T16:31:03.5109697Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163058.xml 2022-09-27T16:31:05.4729188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:05.4729672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:05.4732648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:05.4733135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:05.7096760Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjii526xr 2022-09-27T16:31:05.7098696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjii526xr/_remote_module_non_scriptable.py 2022-09-27T16:31:07.1886555Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:07.1953872Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:31:07.1969920Z 2022-09-27T16:31:07.1970409Z Running tests... 2022-09-27T16:31:07.1970903Z ---------------------------------------------------------------------- 2022-09-27T16:31:07.2634263Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61522 2022-09-27T16:31:07.2640979Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61523 2022-09-27T16:31:08.8559991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:08.8560494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:08.8562598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:08.8563062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:08.8762060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:08.8762527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:08.8766941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:08.8767410Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:09.0929174Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo1o1uymf 2022-09-27T16:31:09.0929998Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo1o1uymf/_remote_module_non_scriptable.py 2022-09-27T16:31:09.1039207Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx6q374of 2022-09-27T16:31:09.1042598Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx6q374of/_remote_module_non_scriptable.py 2022-09-27T16:31:10.6288431Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:10.6336715Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:10.6340743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:10.6673335Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:10.6722437Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:10.6726777Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:10.6727568Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:10.6747094Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:12.2743169Z ok (5.077s) 2022-09-27T16:31:12.2743407Z 2022-09-27T16:31:12.2743802Z ---------------------------------------------------------------------- 2022-09-27T16:31:12.2744141Z Ran 1 test in 5.077s 2022-09-27T16:31:12.2744302Z 2022-09-27T16:31:12.2744379Z OK 2022-09-27T16:31:12.2744516Z 2022-09-27T16:31:12.2744667Z Generating XML reports... 2022-09-27T16:31:12.2780369Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163107.xml 2022-09-27T16:31:14.2470028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:14.2470552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:14.2473357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:14.2473894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:14.4745592Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfp96rc45 2022-09-27T16:31:14.4746891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfp96rc45/_remote_module_non_scriptable.py 2022-09-27T16:31:15.9314820Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:15.9382445Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-09-27T16:31:15.9398258Z 2022-09-27T16:31:15.9398542Z Running tests... 2022-09-27T16:31:15.9399212Z ---------------------------------------------------------------------- 2022-09-27T16:31:16.0009943Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61643 2022-09-27T16:31:16.0015052Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61644 2022-09-27T16:31:17.6032040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:17.6032569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:17.6035101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:17.6035565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:17.6134822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:17.6135288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:17.6139094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:17.6139549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:17.8319099Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0lhkdm64 2022-09-27T16:31:17.8319683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0lhkdm64/_remote_module_non_scriptable.py 2022-09-27T16:31:17.8449834Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqotjfm3u 2022-09-27T16:31:17.8452859Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqotjfm3u/_remote_module_non_scriptable.py 2022-09-27T16:31:19.3757729Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:19.3805839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:19.3809942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:19.3928897Z INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:19.3977574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:19.3981637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:19.3982438Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:19.4014167Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:21.0113239Z ok (5.071s) 2022-09-27T16:31:21.0113475Z 2022-09-27T16:31:21.0113876Z ---------------------------------------------------------------------- 2022-09-27T16:31:21.0114221Z Ran 1 test in 5.071s 2022-09-27T16:31:21.0114386Z 2022-09-27T16:31:21.0114479Z OK 2022-09-27T16:31:21.0116135Z 2022-09-27T16:31:21.0116440Z Generating XML reports... 2022-09-27T16:31:21.0150431Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163115.xml 2022-09-27T16:31:21.7474843Z Running distributed/fsdp/test_fsdp_freezing_weights ... [2022-09-27 16:31:21.746932] 2022-09-27T16:31:21.7475658Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_freezing_weights.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-09-27 16:31:21.747018] 2022-09-27T16:31:23.6541121Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights 2022-09-27T16:31:23.6559068Z 2022-09-27T16:31:23.6559443Z Running tests... 2022-09-27T16:31:23.6559951Z ---------------------------------------------------------------------- 2022-09-27T16:31:25.1996441Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:31:25.2182381Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61764 2022-09-27T16:31:25.2190801Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61765 2022-09-27T16:31:26.8789371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:26.8789884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:26.8790960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:26.8791430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:26.8839807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:26.8840268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:26.8842894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:26.8843382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:27.1303237Z dist init r=0, world=2 2022-09-27T16:31:27.1307708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:27.1344048Z dist init r=1, world=2 2022-09-27T16:31:27.1350087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:27.1351139Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:27.1410934Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:28.5089081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:28.5089768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:29.7497869Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:29.7498411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:30.4296774Z ok (6.773s) 2022-09-27T16:31:30.4318410Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61849 2022-09-27T16:31:30.4325806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61850 2022-09-27T16:31:32.1178939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:32.1179435Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:32.1180891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:32.1181353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:32.1435188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:32.1435656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:32.1438575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:32.1439036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:32.3758163Z dist init r=0, world=2 2022-09-27T16:31:32.3762281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:32.3869113Z dist init r=1, world=2 2022-09-27T16:31:32.3874983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:32.3875787Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:32.3966839Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:33.7470343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:33.7471222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:34.9693883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:34.9694419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:34.9799048Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:31:34.9799836Z warnings.warn( 2022-09-27T16:31:34.9800940Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:31:34.9801695Z warnings.warn( 2022-09-27T16:31:35.6429200Z ok (5.213s) 2022-09-27T16:31:35.6449757Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61934 2022-09-27T16:31:35.6456041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61935 2022-09-27T16:31:37.2246742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:37.2247248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:37.2248349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:37.2248810Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:37.3036080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:37.3036585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:37.3037959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:37.3038414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:37.4816687Z dist init r=0, world=2 2022-09-27T16:31:37.4820920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:37.5432873Z dist init r=1, world=2 2022-09-27T16:31:37.5438289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:37.5439074Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:37.5531450Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:38.9217781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:38.9218596Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:40.1003914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:40.1006399Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:40.7578300Z ok (5.115s) 2022-09-27T16:31:40.7598404Z test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62019 2022-09-27T16:31:40.7604857Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62020 2022-09-27T16:31:42.4074844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:42.4075573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:42.4076401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:42.4076882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:42.4412342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:42.4412804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:42.4416043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:42.4416515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:42.6610462Z dist init r=1, world=2 2022-09-27T16:31:42.6614602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:42.6833960Z dist init r=0, world=2 2022-09-27T16:31:42.6839512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:42.6840260Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:42.6920840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:44.0836806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:44.0837777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:45.2436127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:45.2437165Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:45.2515942Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:31:45.2517422Z warnings.warn( 2022-09-27T16:31:45.2519668Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:31:45.2521106Z warnings.warn( 2022-09-27T16:31:45.8718709Z ok (5.114s) 2022-09-27T16:31:45.8738605Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62104 2022-09-27T16:31:45.8744626Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62105 2022-09-27T16:31:47.4950049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:47.4950560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:47.4951362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:47.4951823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:47.5267641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:47.5268336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:47.5271324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:47.5272017Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:47.7491517Z dist init r=0, world=2 2022-09-27T16:31:47.7495381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:47.7703231Z dist init r=1, world=2 2022-09-27T16:31:47.7708767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:47.7709540Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:47.7801375Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:49.1478731Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:49.1479285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:50.4247061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:50.4247644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:51.1852906Z ok (5.313s) 2022-09-27T16:31:51.1873863Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62189 2022-09-27T16:31:51.1880310Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62190 2022-09-27T16:31:52.8563903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:52.8564446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:52.8565481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:52.8566005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:52.8597421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:52.8597888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:52.8600752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:52.8601232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:53.0985078Z dist init r=0, world=2 2022-09-27T16:31:53.0989049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:53.1085354Z dist init r=1, world=2 2022-09-27T16:31:53.1090705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:53.1091711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:53.1092413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:54.4599261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:31:54.4599783Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:55.7385094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:55.7385624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:31:55.7676380Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:31:55.7677644Z warnings.warn( 2022-09-27T16:31:55.7679890Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:31:55.7680832Z warnings.warn( 2022-09-27T16:31:56.4988761Z ok (5.313s) 2022-09-27T16:31:56.5009769Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62304 2022-09-27T16:31:56.5015663Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62305 2022-09-27T16:31:58.1742537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:58.1743088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:58.1744935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:58.1745738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:58.1998605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:31:58.1999441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:31:58.2002215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:31:58.2003037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:31:58.4345804Z dist init r=0, world=2 2022-09-27T16:31:58.4349809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:31:58.4466154Z dist init r=1, world=2 2022-09-27T16:31:58.4471381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:31:58.4472872Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:58.4555237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:31:59.8261973Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:31:59.8262843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:32:01.0358153Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:32:01.0358770Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:32:01.7117720Z ok (5.213s) 2022-09-27T16:32:01.7137626Z test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True (__main__.TestFreezingWeights) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62389 2022-09-27T16:32:01.7143456Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62390 2022-09-27T16:32:03.4138758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:32:03.4139273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:32:03.4140870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:32:03.4141359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:32:03.4374661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:32:03.4375155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:32:03.4377399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:32:03.4377879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:32:03.6834785Z dist init r=0, world=2 2022-09-27T16:32:03.6839318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:32:03.6928806Z dist init r=1, world=2 2022-09-27T16:32:03.6934131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:32:03.6935234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:32:03.6942367Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:32:05.0771670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:32:05.0772212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:32:06.2655017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:32:06.2655831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:32:06.2765862Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:32:06.2766670Z warnings.warn( 2022-09-27T16:32:06.2767766Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 
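The UserWarning repeated above names its own fix: pass device_id so FSDP flattens and shards on the GPU and can honor sync_module_states=True. A sketch of that wrapping, assuming an initialized process group, one GPU per rank, and an arbitrary module:

import torch
from torch import nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap(module: nn.Module) -> FSDP:
    # Moving the module via device_id avoids the slower CPU flattening and
    # sharding path and satisfies the GPU-communication requirement that
    # sync_module_states imposes, per the warning text.
    return FSDP(
        module,
        device_id=torch.cuda.current_device(),
        sync_module_states=True,
    )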
2022-09-27T16:32:06.2768512Z warnings.warn( 2022-09-27T16:32:07.0251248Z ok (5.313s) 2022-09-27T16:32:07.0251581Z 2022-09-27T16:32:07.0252257Z ---------------------------------------------------------------------- 2022-09-27T16:32:07.0252895Z Ran 8 tests in 43.369s 2022-09-27T16:32:07.0253214Z 2022-09-27T16:32:07.0253766Z OK 2022-09-27T16:32:07.0254037Z 2022-09-27T16:32:07.0254383Z Generating XML reports... 2022-09-27T16:32:07.0305246Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20220927163123.xml 2022-09-27T16:32:07.4148075Z Running distributed/fsdp/test_fsdp_comm ... [2022-09-27 16:32:07.414307] 2022-09-27T16:32:07.4148817Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:32:07.414386] 2022-09-27T16:32:09.3266450Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm 2022-09-27T16:32:09.3284696Z 2022-09-27T16:32:09.3284858Z Running tests... 2022-09-27T16:32:09.3285680Z ---------------------------------------------------------------------- 2022-09-27T16:32:09.3304007Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication) 2022-09-27T16:32:10.9049395Z Tests FSDP's communication cost in terms of calls to collective ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:32:10.9239342Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62539 2022-09-27T16:32:10.9245300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62540 2022-09-27T16:32:12.5375539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:32:12.5376043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:32:12.5379078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:32:12.5379567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:32:12.5835391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:32:12.5835888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:32:12.5840055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:32:12.5840523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:32:12.7935798Z dist init r=0, world=2 2022-09-27T16:32:12.7972005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:32:12.8309415Z dist init r=1, world=2 2022-09-27T16:32:12.8314546Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:32:12.8315722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:32:12.8379208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
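The use_no_sync parameter in these test names toggles FSDP's no_sync() context, which defers gradient communication while microbatches accumulate. Roughly, assuming model is already FSDP-wrapped inside an initialized group and batches is a list of input tensors:

def accumulate(model, batches):
    # All but the last microbatch skip the collectives; the final backward
    # outside no_sync() triggers the deferred gradient communication.
    with model.no_sync():
        for batch in batches[:-1]:
            model(batch).sum().backward()
    model(batches[-1]).sum().backward()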
2022-09-27T16:32:07.4148075Z Running distributed/fsdp/test_fsdp_comm ... [2022-09-27 16:32:07.414307]
2022-09-27T16:32:07.4148817Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:32:07.414386]
2022-09-27T16:32:09.3266450Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm
2022-09-27T16:32:09.3284696Z
2022-09-27T16:32:09.3284858Z Running tests...
2022-09-27T16:32:09.3285680Z ----------------------------------------------------------------------
2022-09-27T16:32:09.3304007Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication)
2022-09-27T16:32:10.9049395Z Tests FSDP's communication cost in terms of calls to collective ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:32:10.9239342Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62539
2022-09-27T16:32:10.9245300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62540
2022-09-27T16:32:12.5375539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:12.5376043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:12.5379078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:12.5379567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:12.5835391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:12.5835888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:12.5840055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:12.5840523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:12.7935798Z dist init r=0, world=2
2022-09-27T16:32:12.7972005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:12.8309415Z dist init r=1, world=2
2022-09-27T16:32:12.8314546Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:12.8315722Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:12.8379208Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:14.2306641Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:14.2307188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:15.5342472Z ok (6.205s)
2022-09-27T16:32:15.5361022Z test_communication_nested_model_False_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication)
2022-09-27T16:32:15.5375174Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62624
2022-09-27T16:32:15.5381321Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62625
2022-09-27T16:32:17.2410032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:17.2410538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:17.2413657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:17.2414331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:17.2529484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:17.2529952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:17.2533716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:17.2534186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:17.4945003Z dist init r=0, world=2
2022-09-27T16:32:17.4949145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:17.5076723Z dist init r=1, world=2
2022-09-27T16:32:17.5081854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:17.5082657Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:17.5153480Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:18.9090309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:18.9090835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:20.2481432Z ok (4.714s)
2022-09-27T16:32:20.2499590Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication)
2022-09-27T16:32:20.2513998Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62709
2022-09-27T16:32:20.2520473Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62710
2022-09-27T16:32:21.8592518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:21.8593221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:21.8595549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:21.8596031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:21.9491910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:21.9492394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:21.9495586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:21.9496069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:22.1106204Z dist init r=0, world=2
2022-09-27T16:32:22.1109992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:22.1883058Z dist init r=1, world=2
2022-09-27T16:32:22.1888408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:22.1889210Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:22.1921631Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:23.5482933Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:23.5483488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:24.8615729Z ok (4.613s)
2022-09-27T16:32:24.8635091Z test_communication_nested_model_False_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication)
2022-09-27T16:32:24.8649948Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62794
2022-09-27T16:32:24.8655687Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62795
2022-09-27T16:32:26.4800351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:26.4800853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:26.4803122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:26.4803606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:26.5144458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:26.5145167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:26.5149205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:26.5149689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:26.7328158Z dist init r=0, world=2
2022-09-27T16:32:26.7332015Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:26.7569142Z dist init r=1, world=2
2022-09-27T16:32:26.7574644Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:26.7575425Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:26.7638467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:28.1541592Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:28.1542154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:29.5752507Z ok (4.714s)
2022-09-27T16:32:29.5770390Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_None (__main__.TestCommunication)
2022-09-27T16:32:29.5784963Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62879
2022-09-27T16:32:29.5790986Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62880
2022-09-27T16:32:31.2242886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:31.2243388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:31.2246152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:31.2246654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:31.3069301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:31.3069793Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:31.3073741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:31.3074218Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:31.4602064Z dist init r=1, world=2
2022-09-27T16:32:31.4606312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:31.5497364Z dist init r=0, world=2
2022-09-27T16:32:31.5503293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:31.5504408Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:31.5518982Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:32.9238975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:32.9239491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:32.9455700Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:32.9456806Z warnings.warn(
2022-09-27T16:32:32.9457938Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:32.9458667Z warnings.warn(
2022-09-27T16:32:33.8875553Z ok (4.312s)
2022-09-27T16:32:33.8896783Z test_communication_nested_model_True_use_no_sync_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication)
2022-09-27T16:32:33.8910433Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62964
2022-09-27T16:32:33.8916746Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62965
2022-09-27T16:32:35.5311105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:35.5311636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:35.5314409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:35.5315073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:35.5570014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:35.5570743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:35.5574674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:35.5575382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:35.7932579Z dist init r=0, world=2
2022-09-27T16:32:35.7936732Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:35.7951637Z dist init r=1, world=2
2022-09-27T16:32:35.7957087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:35.7957898Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:35.8039538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:37.1394843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:37.1395410Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:37.1656904Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:37.1657717Z warnings.warn(
2022-09-27T16:32:37.1658848Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:37.1659583Z warnings.warn(
2022-09-27T16:32:38.0999710Z ok (4.212s)
2022-09-27T16:32:38.1017661Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_None (__main__.TestCommunication)
2022-09-27T16:32:38.1032120Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63049
2022-09-27T16:32:38.1038531Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63050
2022-09-27T16:32:39.8026540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:39.8027236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:39.8029400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:39.8029886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:39.8213163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:39.8213673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:39.8217314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:39.8217848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:40.0668114Z dist init r=0, world=2
2022-09-27T16:32:40.0672031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:40.0745439Z dist init r=1, world=2
2022-09-27T16:32:40.0750478Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:40.0751971Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:40.0774744Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:41.4408735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:41.4409292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:41.4617556Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:41.4618338Z warnings.warn(
2022-09-27T16:32:41.4650533Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:41.4651300Z warnings.warn(
2022-09-27T16:32:42.4124405Z ok (4.312s)
2022-09-27T16:32:42.4142806Z test_communication_nested_model_True_use_no_sync_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunication)
2022-09-27T16:32:42.4157484Z Tests FSDP's communication cost in terms of calls to collective ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63134
2022-09-27T16:32:42.4163920Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63135
2022-09-27T16:32:44.0641421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:44.0641940Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:44.0644563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:44.0645290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:44.0828534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:44.0828998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:44.0833344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:44.0833812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:44.3239892Z dist init r=1, world=2
2022-09-27T16:32:44.3243998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:44.3309268Z dist init r=0, world=2
2022-09-27T16:32:44.3315035Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:44.3315802Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:44.3346851Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:45.7267569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:45.7268105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:45.7495695Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:45.7496484Z warnings.warn(
2022-09-27T16:32:45.7497594Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:45.7498341Z warnings.warn(
2022-09-27T16:32:46.7249979Z ok (4.312s)
2022-09-27T16:32:46.7250204Z
2022-09-27T16:32:46.7250603Z ----------------------------------------------------------------------
2022-09-27T16:32:46.7250926Z Ran 8 tests in 37.396s
2022-09-27T16:32:46.7251100Z
2022-09-27T16:32:46.7253298Z OK
2022-09-27T16:32:46.7253754Z
2022-09-27T16:32:46.7253957Z Generating XML reports...
2022-09-27T16:32:46.7311512Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20220927163209.xml
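Each suite in this shard is launched the same way as the "Executing [...]" lines above show. A rough sketch of that invocation as a standalone reproduction, assuming the working directory is the test/ folder of a pytorch checkout with the same interpreter; the subprocess call mirrors the logged argv, not the runner's actual code:

import subprocess

# Same argv as the log's "Executing [...]" line: -bb turns bytes/str
# comparison warnings into errors, -v prints one line per test, and the
# two --import-* flags load the slow/disabled test lists that produce
# the "loaded 45 slow tests" / "loaded 261 disabled tests" UserWarnings.
subprocess.run(
    [
        "/opt/conda/bin/python",
        "-bb",
        "distributed/fsdp/test_fsdp_comm.py",
        "-v",
        "--import-slow-tests",
        "--import-disabled-tests",
    ],
    check=True,
)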
2022-09-27T16:32:47.0909635Z Running distributed/fsdp/test_fsdp_exec_order ... [2022-09-27 16:32:47.090438]
2022-09-27T16:32:47.0911006Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:32:47.090522]
2022-09-27T16:32:48.9882951Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order
2022-09-27T16:32:48.9899857Z
2022-09-27T16:32:48.9900123Z Running tests...
2022-09-27T16:32:48.9900542Z ----------------------------------------------------------------------
2022-09-27T16:32:48.9908139Z test_invalid_first_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestFSDPExecOrder)
2022-09-27T16:32:50.5420096Z Tests that FSDP errors if the all-gather order differs across ranks ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:32:50.5606089Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63254
2022-09-27T16:32:50.5613862Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63255
2022-09-27T16:32:52.2026880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:52.2027391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:52.2028270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:52.2028778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:52.2259680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:52.2260143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:52.2263021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:52.2263492Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:52.4611473Z dist init r=0, world=2
2022-09-27T16:32:52.4612330Z dist init r=1, world=2
2022-09-27T16:32:52.4616011Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:52.4617727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:52.4619295Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:52.4719715Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:53.8114868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:53.8115694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:53.8349834Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:53.8350637Z warnings.warn(
2022-09-27T16:32:53.8382602Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:53.8383356Z warnings.warn(
2022-09-27T16:32:54.7703035Z ok (5.780s)
2022-09-27T16:32:54.7709774Z test_invalid_first_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestFSDPExecOrder)
2022-09-27T16:32:54.7725868Z Tests that FSDP errors if the all-gather order differs across ranks ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63335
2022-09-27T16:32:54.7731820Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63336
2022-09-27T16:32:56.3839889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:56.3840422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:56.3841244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:56.3841717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:56.4475496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:32:56.4476347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:32:56.4477154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:32:56.4477641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:32:56.6385151Z dist init r=1, world=2
2022-09-27T16:32:56.6389051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:32:56.6892380Z dist init r=0, world=2
2022-09-27T16:32:56.6897819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:32:56.6898583Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:56.6997640Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:32:58.0519354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:32:58.0519887Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:32:58.0746326Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:58.0747123Z warnings.warn(
2022-09-27T16:32:58.0748234Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:32:58.0748985Z warnings.warn(
2022-09-27T16:32:58.9819053Z ok (4.212s)
2022-09-27T16:32:58.9833256Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD_iters_before_path_change_1 (__main__.TestFSDPExecOrder)
2022-09-27T16:32:58.9847901Z Tests that FSDP warns the user if the all-gather order changes after ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63416
2022-09-27T16:32:58.9854445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63417
2022-09-27T16:33:00.6610986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:00.6611487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:00.6612484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:00.6612979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:00.7084544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:00.7085019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:00.7087475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:00.7087952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:00.9115905Z dist init r=0, world=2
2022-09-27T16:33:00.9120002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:00.9505402Z dist init r=1, world=2
2022-09-27T16:33:00.9510702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:00.9512024Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:00.9526366Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:02.3273358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:02.3273869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:02.3508687Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:02.3509494Z warnings.warn(
2022-09-27T16:33:02.3510613Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:02.3511568Z warnings.warn(
2022-09-27T16:33:03.2941955Z ok (4.312s)
2022-09-27T16:33:03.2957186Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_FULL_SHARD_iters_before_path_change_3 (__main__.TestFSDPExecOrder)
2022-09-27T16:33:03.2972530Z Tests that FSDP warns the user if the all-gather order changes after ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63501
2022-09-27T16:33:03.2978627Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63502
2022-09-27T16:33:04.9749137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:04.9749687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:04.9751059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:04.9751544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:05.0181961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:05.0182477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:05.0184872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:05.0185361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:05.2261501Z dist init r=0, world=2
2022-09-27T16:33:05.2265562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:05.2587881Z dist init r=1, world=2
2022-09-27T16:33:05.2594208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:05.2595412Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:05.2673272Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:06.6223206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:06.6223720Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:06.6426536Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:06.6427644Z warnings.warn(
2022-09-27T16:33:06.6428765Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:06.6429515Z warnings.warn(
2022-09-27T16:33:07.6065997Z ok (4.312s)
2022-09-27T16:33:07.6079753Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP_iters_before_path_change_1 (__main__.TestFSDPExecOrder)
2022-09-27T16:33:07.6094125Z Tests that FSDP warns the user if the all-gather order changes after ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63586
2022-09-27T16:33:07.6100401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63587
2022-09-27T16:33:09.2477720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:09.2478226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:09.2479785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:09.2480263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:09.2680933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:09.2681494Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:09.2684756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:09.2685250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:09.5033621Z dist init r=1, world=2
2022-09-27T16:33:09.5037944Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:09.5136116Z dist init r=0, world=2
2022-09-27T16:33:09.5141596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:09.5142352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:09.5242334Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:10.8929206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:10.8929756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:10.9146878Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:10.9147758Z warnings.warn(
2022-09-27T16:33:10.9182978Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:10.9183854Z warnings.warn(
2022-09-27T16:33:11.8186362Z ok (4.212s)
2022-09-27T16:33:11.8199659Z test_invalid_later_iter_order_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP_iters_before_path_change_3 (__main__.TestFSDPExecOrder)
2022-09-27T16:33:11.8212876Z Tests that FSDP warns the user if the all-gather order changes after ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63671
2022-09-27T16:33:11.8218847Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63672
2022-09-27T16:33:13.5130725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:13.5131204Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:13.5132119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:13.5132609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:13.5330249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:13.5330686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:13.5333902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:13.5334372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:13.7755630Z dist init r=0, world=2
2022-09-27T16:33:13.7759609Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:13.7840814Z dist init r=1, world=2
2022-09-27T16:33:13.7846271Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:13.7847505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:13.7862261Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:15.1541530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:15.1542042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:15.1747076Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:15.1747866Z warnings.warn(
2022-09-27T16:33:15.1781279Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:15.1782059Z warnings.warn(
2022-09-27T16:33:16.1305645Z ok (4.312s)
2022-09-27T16:33:16.1330241Z test_train_eval_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestFSDPExecOrder) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63756
2022-09-27T16:33:16.1336573Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63757
2022-09-27T16:33:17.7760495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:17.7761770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:17.7762924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:17.7763864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:17.7964308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:17.7965245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:17.7967744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:17.7968707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:18.0315318Z dist init r=1, world=2
2022-09-27T16:33:18.0319482Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:18.0417970Z dist init r=0, world=2
2022-09-27T16:33:18.0423392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:18.0424200Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:18.0524550Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:19.3961145Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:19.3961666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:19.4187852Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:19.4188642Z warnings.warn(
2022-09-27T16:33:19.4222372Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:19.4223125Z warnings.warn(
2022-09-27T16:33:20.3430460Z ok (4.212s)
2022-09-27T16:33:20.3454821Z test_train_eval_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestFSDPExecOrder) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63841
2022-09-27T16:33:20.3461183Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63842
2022-09-27T16:33:22.0148265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:22.0149002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:22.0149620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:22.0150093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:22.0215564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:22.0216012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:22.0219410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:22.0219891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:22.2678389Z dist init r=0, world=2
2022-09-27T16:33:22.2682323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:22.2682713Z dist init r=1, world=2
2022-09-27T16:33:22.2687062Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:22.2687824Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:22.2785392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:23.6507318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:23.6507840Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:23.6749555Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:23.6750358Z warnings.warn(
2022-09-27T16:33:23.6751693Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication.
2022-09-27T16:33:23.6752444Z warnings.warn(
2022-09-27T16:33:24.6560484Z ok (4.313s)
2022-09-27T16:33:24.6560726Z
2022-09-27T16:33:24.6561146Z ----------------------------------------------------------------------
2022-09-27T16:33:24.6561521Z Ran 8 tests in 35.666s
2022-09-27T16:33:24.6561688Z
2022-09-27T16:33:24.6561784Z OK
2022-09-27T16:33:24.6561920Z
2022-09-27T16:33:24.6562054Z Generating XML reports...
2022-09-27T16:33:24.6618425Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order/TEST-TestFSDPExecOrder-20220927163248.xml
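The TestFSDPExecOrder docstrings above describe the invariant being checked: every rank must all-gather its FSDP units in the same order. A hedged illustration of the kind of model that violates it (a made-up module, not the test's actual code), which would need to run under two ranks, e.g. via torchrun:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

class RankDivergent(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.a = torch.nn.Linear(8, 8)
        self.b = torch.nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rank 0 runs `a` then `b`; other ranks run `b` then `a`. If `a`
        # and `b` are wrapped as separate FSDP units, the ranks issue
        # their parameter all-gathers in different orders, which is the
        # condition test_invalid_first_iter_order expects FSDP to flag.
        if dist.get_rank() == 0:
            return self.b(self.a(x))
        return self.a(self.b(x))

# e.g. model = FSDP(RankDivergent().cuda()) with `a` and `b` each wrapped
# as their own FSDP instance (manual wrapping or an auto-wrap policy).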
2022-09-27T16:33:25.0232079Z Running distributed/fsdp/test_fsdp_checkpoint ... [2022-09-27 16:33:25.022726]
2022-09-27T16:33:25.0232809Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:33:25.022805]
2022-09-27T16:33:26.9167205Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_checkpoint
2022-09-27T16:33:26.9183907Z
2022-09-27T16:33:26.9184239Z Running tests...
2022-09-27T16:33:26.9184682Z ----------------------------------------------------------------------
2022-09-27T16:33:28.4271188Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=False)_offload_activations_False (__main__.TestFSDPCheckpoint) ... INFO:numba.cuda.cudadrv.driver:init
2022-09-27T16:33:28.4451497Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63961
2022-09-27T16:33:28.4458959Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63962
2022-09-27T16:33:30.1029570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:30.1030114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:30.1031460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:30.1031945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:30.1472735Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:30.1473483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:30.1476502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:30.1476978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:30.3558825Z dist init r=1, world=2
2022-09-27T16:33:30.3564013Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:30.3903613Z dist init r=0, world=2
2022-09-27T16:33:30.3908692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:30.3909471Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:30.3971340Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:31.7774534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:31.7775079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:32.7547240Z ok (5.836s)
2022-09-27T16:33:32.7577187Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=False)_offload_activations_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64046
2022-09-27T16:33:32.7583815Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64047
2022-09-27T16:33:34.3967760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:34.3968376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:34.3969349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:34.3970115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:34.4096559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:34.4097026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:34.4100613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:34.4101337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:34.6530312Z dist init r=1, world=2
2022-09-27T16:33:34.6534643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:34.6557982Z dist init r=0, world=2
2022-09-27T16:33:34.6563734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:34.6564606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:34.6637940Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:36.0600277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:36.0600807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:37.0670193Z ok (4.312s)
2022-09-27T16:33:37.0700455Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=True)_offload_activations_False (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64131
2022-09-27T16:33:37.0706869Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64132
2022-09-27T16:33:38.6608202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:38.6608949Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:38.6609551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:38.6610028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:38.7269663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:38.7270161Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:38.7271707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:38.7272188Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:38.9163787Z dist init r=1, world=2
2022-09-27T16:33:38.9167708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:38.9682263Z dist init r=0, world=2
2022-09-27T16:33:38.9687648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:38.9688460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:38.9777120Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:40.3565366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:40.3565883Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:41.3798408Z ok (4.313s)
2022-09-27T16:33:41.3828233Z test_basic_checkpoint_end_to_end_cpu_offload_CPUOffload(offload_params=True)_offload_activations_True (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64216
2022-09-27T16:33:41.3834439Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64217
2022-09-27T16:33:43.0770793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:43.0771318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:43.0772204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:43.0772677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:43.0822243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:43.0822737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:43.0825428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:43.0825913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:43.3193734Z dist init r=1, world=2
2022-09-27T16:33:43.3198054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:43.3398346Z dist init r=0, world=2
2022-09-27T16:33:43.3403672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:43.3405207Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:43.3504710Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:44.7198172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:44.7198698Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:45.7923202Z ok (4.412s)
2022-09-27T16:33:45.7949148Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=False)_offload_activations_False (__main__.TestFSDPCheckpoint) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64301
2022-09-27T16:33:45.7955689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64302
2022-09-27T16:33:47.4704013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:47.4704493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:47.4705375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:47.4705850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:47.4977014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests
2022-09-27T16:33:47.4977461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests")
2022-09-27T16:33:47.4980927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests
2022-09-27T16:33:47.4981411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests")
2022-09-27T16:33:47.7279464Z dist init r=1, world=2
2022-09-27T16:33:47.7291525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
2022-09-27T16:33:47.7438203Z dist init r=0, world=2
2022-09-27T16:33:47.7444076Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
2022-09-27T16:33:47.7444849Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:47.7496621Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2022-09-27T16:33:49.1316745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1
2022-09-27T16:33:49.1317276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0
2022-09-27T16:33:50.1043465Z ok (4.312s)
2022-09-27T16:33:50.1069103Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=False)_offload_activations_True (__main__.TestFSDPCheckpoint) ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64386 2022-09-27T16:33:50.1075288Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64387 2022-09-27T16:33:51.7935862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:33:51.7936365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:33:51.7937232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:33:51.7937688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:33:51.8073710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:33:51.8074388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:33:51.8077627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:33:51.8078091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:33:52.0494772Z dist init r=1, world=2 2022-09-27T16:33:52.0499590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:33:52.0528188Z dist init r=0, world=2 2022-09-27T16:33:52.0534472Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:33:52.0535538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:33:52.0602243Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:33:53.4342861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:33:53.4343391Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:33:54.4164103Z ok (4.312s) 2022-09-27T16:33:54.4190100Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=True)_offload_activations_False (__main__.TestFSDPCheckpoint) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64471 2022-09-27T16:33:54.4196336Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64472 2022-09-27T16:33:56.0296147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:33:56.0296648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:33:56.0298030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:33:56.0298498Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:33:56.0327187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:33:56.0327645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:33:56.0331467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:33:56.0331925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:33:56.2814272Z dist init r=1, world=2 2022-09-27T16:33:56.2819525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:33:56.2848001Z dist init r=0, world=2 2022-09-27T16:33:56.2853704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:33:56.2854529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:33:56.2922780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:33:57.6481760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:33:57.6482736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:33:58.6294690Z ok (4.213s) 2022-09-27T16:33:58.6311626Z test_checkpoint_fsdp_wrapping_cpu_offload_CPUOffload(offload_params=True)_offload_activations_True (__main__.TestFSDPCheckpoint) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/71349 for all platform(s). If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.002s) 2022-09-27T16:33:58.6312474Z 2022-09-27T16:33:58.6314146Z ---------------------------------------------------------------------- 2022-09-27T16:33:58.6314567Z Ran 8 tests in 31.713s 2022-09-27T16:33:58.6314745Z 2022-09-27T16:33:58.6314861Z OK (skipped=1) 2022-09-27T16:33:58.6315018Z 2022-09-27T16:33:58.6315146Z Generating XML reports... 2022-09-27T16:33:58.6368925Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_checkpoint/TEST-TestFSDPCheckpoint-20220927163326.xml 2022-09-27T16:33:59.0147282Z Running distributed/fsdp/test_fsdp_meta ... [2022-09-27 16:33:59.014249] 2022-09-27T16:33:59.0147986Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_meta.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ...
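The TestFSDPCheckpoint run that just finished pairs FullyShardedDataParallel with activation checkpointing, sweeping CPUOffload(offload_params=...) against offload_activations_{False,True}; the "dist init r=N, world=2" and store_based_barrier lines are each spawned rank bringing up its process group before the test body runs. Below is a minimal sketch of that per-rank setup, assuming the private checkpoint_wrapper API as it existed around the time of this log; the model, sizes, and port are illustrative, not the suite's own.

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp
    from torch.distributed.fsdp import CPUOffload, FullyShardedDataParallel as FSDP
    from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import checkpoint_wrapper

    def run(rank: int, world_size: int = 2) -> None:
        # Matches "dist init r=<rank>, world=2"; init_process_group performs the
        # store-based barrier logged as store_based_barrier_key:1.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)

        # offload_activations=True in the test name maps to activation
        # checkpointing with CPU offload on the wrapped submodule.
        inner = checkpoint_wrapper(torch.nn.Linear(8, 8).cuda(), offload_to_cpu=True)
        model = FSDP(
            torch.nn.Sequential(inner, torch.nn.Linear(8, 8).cuda()),
            cpu_offload=CPUOffload(offload_params=True),  # offload_params=True variant
        )
        model(torch.randn(4, 8, device="cuda")).sum().backward()
        dist.destroy_process_group()

    if __name__ == "__main__":
        # Corresponds to the "Started process N with pid ..." lines above.
        mp.spawn(run, nprocs=2)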
[2022-09-27 16:33:59.014326] 2022-09-27T16:34:00.8679723Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_meta 2022-09-27T16:34:00.8697604Z 2022-09-27T16:34:00.8697974Z Running tests... 2022-09-27T16:34:00.8698433Z ---------------------------------------------------------------------- 2022-09-27T16:34:02.4313808Z test_bad_arg_meta (__main__.TestFSDPWithMetaDevice) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:34:02.4503638Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64591 2022-09-27T16:34:02.4509697Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64592 2022-09-27T16:34:04.1279978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:04.1280469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:04.1283291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:04.1283767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:04.1539657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:04.1540125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:04.1544228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:04.1544711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:04.3728369Z dist init r=1, world=2 2022-09-27T16:34:04.3732316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:04.3756226Z dist init r=0, world=2 2022-09-27T16:34:04.3760980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:04.3761748Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:04.3835194Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:05.7583204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:05.7583743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:06.2587365Z ok (5.389s) 2022-09-27T16:34:06.2591256Z test_bad_arg_torchdistx (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-09-27T16:34:06.2610695Z test_nested_model_with_meta_device_default_init_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64672 2022-09-27T16:34:06.2616709Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64673 2022-09-27T16:34:07.9002459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:07.9002977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:07.9005887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:07.9006372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:07.9098590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:07.9099050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:07.9103212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:07.9103662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:08.1283628Z dist init r=1, world=2 2022-09-27T16:34:08.1287578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:08.1329211Z dist init r=0, world=2 2022-09-27T16:34:08.1334293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:08.1335653Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:08.1390598Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:09.5031601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:09.5032165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:10.3699704Z ok (4.111s) 2022-09-27T16:34:10.3716702Z test_nested_model_with_meta_device_default_init_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64757 2022-09-27T16:34:10.3722680Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64758 2022-09-27T16:34:12.0425226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:12.0425747Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:12.0428366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:12.0428852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:12.0647412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:12.0647871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:12.0652719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:12.0653192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:12.2787989Z dist init r=0, world=2 2022-09-27T16:34:12.2792469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:12.2897509Z dist init r=1, world=2 2022-09-27T16:34:12.2902807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:12.2903853Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:12.2996089Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:13.6813213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:13.6813743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:14.5808865Z ok (4.211s) 2022-09-27T16:34:14.5826646Z test_nested_model_with_meta_device_reset_params_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64842 2022-09-27T16:34:14.5833684Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64843 2022-09-27T16:34:16.2312323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:16.2312838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:16.2315299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:16.2315787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:16.2426195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:16.2426655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:16.2430981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:16.2431820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:16.4713883Z dist init r=1, world=2 2022-09-27T16:34:16.4715815Z dist init r=0, world=2 2022-09-27T16:34:16.4719307Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:16.4721444Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:16.4722569Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:16.4822711Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:17.8804859Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:17.8805417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:18.7919130Z ok (4.211s) 2022-09-27T16:34:18.7938072Z test_nested_model_with_meta_device_reset_params_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64927 2022-09-27T16:34:18.7944426Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64928 2022-09-27T16:34:20.3822269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:20.3822786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:20.3825300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:20.3825785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:20.4410518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:20.4411457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:20.4413741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:20.4414577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:20.6155287Z dist init r=1, world=2 2022-09-27T16:34:20.6159710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:20.6610395Z dist init r=0, world=2 2022-09-27T16:34:20.6615707Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:20.6616739Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:20.6668238Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:22.0619285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:22.0620116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:23.0037184Z ok (4.212s) 2022-09-27T16:34:23.0041565Z test_nested_model_with_torchdistX_default_init_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-09-27T16:34:23.0046758Z test_nested_model_with_torchdistX_default_init_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-09-27T16:34:23.0052331Z test_nested_model_with_torchdistX_init_fn_auto_wrap_False (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.000s) 2022-09-27T16:34:23.0058307Z test_nested_model_with_torchdistX_init_fn_auto_wrap_True (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-09-27T16:34:23.0075434Z test_simple_model_with_meta_device_default_init (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65012 2022-09-27T16:34:23.0081681Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65013 2022-09-27T16:34:24.6757070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:24.6757707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:24.6760435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:24.6760916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:24.6875147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:24.6875631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:24.6879854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:24.6880323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:24.9085409Z dist init r=0, world=2 2022-09-27T16:34:24.9089344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:24.9140040Z dist init r=1, world=2 2022-09-27T16:34:24.9145280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:24.9146190Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:24.9192442Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:26.2874417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:26.2874945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:27.2164221Z ok (4.210s) 2022-09-27T16:34:27.2181657Z test_simple_model_with_meta_device_reset_params (__main__.TestFSDPWithMetaDevice) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65097 2022-09-27T16:34:27.2188364Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65098 2022-09-27T16:34:28.8674506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:28.8675426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:28.8678269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:28.8679240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:28.8831677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:28.8832183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:28.8836248Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:28.8836747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:29.1104268Z dist init r=1, world=2 2022-09-27T16:34:29.1107858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:34:29.1165990Z dist init r=0, world=2 2022-09-27T16:34:29.1171716Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:34:29.1172898Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:29.1210805Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:34:30.5246259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:30.5246789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:31.4272972Z ok (4.211s) 2022-09-27T16:34:31.4278867Z test_simple_model_with_torchdistX_default_init (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-09-27T16:34:31.4284946Z test_simple_model_with_torchdistX_init_fn (__main__.TestFSDPWithMetaDevice) ... skip: Test requires torchdistX: https://github.com/pytorch/torchdistX (0.001s) 2022-09-27T16:34:31.4285675Z 2022-09-27T16:34:31.4286315Z ---------------------------------------------------------------------- 2022-09-27T16:34:31.4286960Z Ran 14 tests in 30.559s 2022-09-27T16:34:31.4288619Z 2022-09-27T16:34:31.4289049Z OK (skipped=7) 2022-09-27T16:34:31.4289248Z 2022-09-27T16:34:31.4289387Z Generating XML reports... 2022-09-27T16:34:31.4349819Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_meta/TEST-TestFSDPWithMetaDevice-20220927163400.xml 2022-09-27T16:34:31.8075773Z Running distributed/_shard/sharded_tensor/ops/test_matrix_ops ... [2022-09-27 16:34:31.807100] 2022-09-27T16:34:31.8076615Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:34:31.807173] 2022-09-27T16:34:33.6855748Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops 2022-09-27T16:34:33.6874881Z 2022-09-27T16:34:33.6875198Z Running tests... 
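The TestFSDPWithMetaDevice block above builds models on the meta device, where parameters carry only shape and dtype until FSDP materializes them; the seven skips are the torchdistX variants, which need the optional torchdistX package. Below is a rough sketch of the meta-device pattern, to be run inside an initialized process group as in the earlier sketch; init_fn and the two-layer model are illustrative, and my reading of the test names (not confirmed from the test source) is that the default_init variants use FSDP's built-in materialization while a user-supplied param_init_fn like this covers the other path.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Meta-device parameters have shape and dtype but no storage, so even a
    # very large model is cheap to construct on every rank.
    model = nn.Sequential(
        nn.Linear(8, 8, device="meta"),
        nn.Linear(8, 8, device="meta"),
    )

    def init_fn(module: nn.Module) -> None:
        # Illustrative initializer: give the meta parameters real (uninitialized)
        # CUDA storage, then let each submodule re-run its own reset_parameters().
        module.to_empty(device="cuda")
        for m in module.modules():
            if hasattr(m, "reset_parameters"):
                m.reset_parameters()

    # FSDP materializes the meta submodules via param_init_fn before sharding.
    fsdp_model = FSDP(model, param_init_fn=init_fn)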
2022-09-27T16:34:33.6875719Z ---------------------------------------------------------------------- 2022-09-27T16:34:35.1858577Z test_sharded_tensor_contiguous (__main__.TestShardedTensorMatrixOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:34:35.2493277Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65217 2022-09-27T16:34:35.2498249Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65218 2022-09-27T16:34:35.2504335Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65219 2022-09-27T16:34:35.2510657Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65220 2022-09-27T16:34:36.8638447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:36.8638987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:36.8639792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:36.8640300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:36.8869647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:36.8870384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:36.8871897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:36.8872368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:36.8876962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:36.8877421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:36.8879931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:36.8880400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:36.9458125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:36.9458618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:36.9460372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:36.9460853Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:37.1106276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:37.1197585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:37.1216902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:37.1759638Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:37.5568590Z skip: Need at least 4 CUDA devices (3.869s) 2022-09-27T16:34:37.5595555Z test_sharded_tensor_layer_norm (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65353 2022-09-27T16:34:37.5602063Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65354 2022-09-27T16:34:37.5608586Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65355 2022-09-27T16:34:37.5615417Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65356 2022-09-27T16:34:39.1684240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:39.1684770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:39.1686392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:39.1686875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:39.1754075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:39.1754557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:39.1757365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:39.1757845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:39.2047548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:39.2048012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:39.2053125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:39.2053602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:39.2737681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:39.2738407Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:39.2739718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:39.2740213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:39.4064210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:39.4086366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:39.4484351Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:39.4988900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:39.8670164Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:34:39.8692313Z test_sharded_tensor_layer_norm_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65489 2022-09-27T16:34:39.8698132Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65490 2022-09-27T16:34:39.8704547Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65491 2022-09-27T16:34:39.8710941Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65492 2022-09-27T16:34:41.4812796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:41.4813308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:41.4814777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:41.4815256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:41.4857101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:41.4857563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:41.4860845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:41.4861329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:41.5114982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:41.5115448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:41.5117527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:41.5118008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:41.5280180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:41.5280663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:41.5283454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:41.5283931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:41.7133270Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:41.7327287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:41.7459420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:41.7478285Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:42.1774909Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:34:42.1793810Z test_sharded_tensor_masked_fill (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65625 2022-09-27T16:34:42.1800756Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65626 2022-09-27T16:34:42.1807708Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65627 2022-09-27T16:34:42.1814200Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65628 2022-09-27T16:34:43.9115482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:43.9115997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:43.9117322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:43.9117875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:43.9337338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:43.9338131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:43.9339532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:43.9340079Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:43.9400374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:43.9401097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:43.9403870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:43.9404594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:43.9601865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:43.9602475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:43.9605563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:43.9606159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:44.1617247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:44.1697726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:44.1724267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:44.1883506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:44.5874370Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:34:44.5897251Z test_sharded_tensor_masked_fill_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65761 2022-09-27T16:34:44.5905906Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65762 2022-09-27T16:34:44.5914397Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65763 2022-09-27T16:34:44.5922638Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65764 2022-09-27T16:34:46.2105974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:46.2106488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:46.2107069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:46.2107500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:46.2108086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:46.2108574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:46.2109396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:46.2109870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:46.2393501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:46.2393965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:46.2395139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:46.2395590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:46.2879537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:46.2880307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:46.2881133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:46.2881590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:46.4648252Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:46.4677131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:46.4691166Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:46.5130529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:46.8981445Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:34:46.9001748Z test_sharded_tensor_softmax (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65897 2022-09-27T16:34:46.9008473Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65898 2022-09-27T16:34:46.9014951Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 65899 2022-09-27T16:34:46.9021917Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 65900 2022-09-27T16:34:48.5044277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:48.5044759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:48.5045664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:48.5046138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:48.5493969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:48.5494455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:48.5495496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:48.5495977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:48.5628844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:48.5629283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:48.5632786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:48.5633261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:48.6301800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:48.6302426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:48.6303840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:48.6304592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:48.7392798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:48.7700814Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:48.7844078Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:48.8563443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:49.3084618Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:34:49.3111070Z test_sharded_tensor_transpose (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66033 2022-09-27T16:34:49.3119538Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66034 2022-09-27T16:34:49.3126333Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66035 2022-09-27T16:34:49.3133234Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66036 2022-09-27T16:34:50.9258712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:50.9259201Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:50.9260918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:50.9261383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:50.9424864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:50.9425334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:50.9428081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:50.9428557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:50.9571618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:50.9572078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:50.9575175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:50.9575632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:50.9791330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:50.9792024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:50.9794914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:50.9795393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:51.1791941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:51.1822025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:51.1830466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:51.2011532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:51.6188774Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:34:51.6208341Z test_sharded_tensor_transpose_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66169 2022-09-27T16:34:51.6214910Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66170 2022-09-27T16:34:51.6221673Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66171 2022-09-27T16:34:51.6228362Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66172 2022-09-27T16:34:53.2594545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:53.2595053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:53.2596097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:53.2596566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:53.2679048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:53.2679505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:53.2682398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:53.2682858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:53.3402699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:53.3403189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:53.3403784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:53.3404241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:53.3565942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:53.3566408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:53.3566987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:53.3567453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:53.4927242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:53.5002735Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:53.5625223Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:53.5852653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:54.0285563Z skip: Need at least 4 CUDA devices (2.410s) 2022-09-27T16:34:54.0309337Z test_sharded_tensor_type_as (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66305 2022-09-27T16:34:54.0316143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66306 2022-09-27T16:34:54.0323308Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66307 2022-09-27T16:34:54.0329784Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66308 2022-09-27T16:34:55.6416108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:55.6416647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:55.6417793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:55.6418266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:55.6437035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:55.6437627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:55.6440705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:55.6441192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:55.6599622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:55.6600113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:55.6603894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:55.6604367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:55.6712521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:55.6712974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:55.6715760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:55.6716375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:55.8711131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:55.8949589Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:55.9025860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:55.9032987Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:56.3385502Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:34:56.3408983Z test_sharded_tensor_view (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66441 2022-09-27T16:34:56.3415509Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66442 2022-09-27T16:34:56.3422271Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66443 2022-09-27T16:34:56.3428891Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66444 2022-09-27T16:34:57.9567583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:57.9568553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:57.9569740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:57.9570698Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:57.9668374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:57.9669292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:57.9671468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:57.9672463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:57.9699332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:57.9700234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:57.9702373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:57.9703343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:57.9795531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:34:57.9796339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:34:57.9798074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:34:57.9799029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:34:58.2013561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:34:58.2056000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:34:58.2063954Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:34:58.2127776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:34:58.6485233Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:34:58.6506098Z test_sharded_tensor_view_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66577 2022-09-27T16:34:58.6512679Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66578 2022-09-27T16:34:58.6519241Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 66579 2022-09-27T16:34:58.6525773Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 66580 2022-09-27T16:35:00.2464000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:00.2464527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:00.2465120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:00.2465573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:00.2562476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:00.2562944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:00.2565457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:00.2565921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:00.2668209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:00.2668676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:00.2672110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:00.2672568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:00.2906280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:00.2906750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:00.2909816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:00.2910272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:00.4864719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:00.4982339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:00.5011930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:35:00.5192963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:35:00.9583469Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:35:00.9583729Z 2022-09-27T16:35:00.9584122Z ---------------------------------------------------------------------- 2022-09-27T16:35:00.9584455Z Ran 11 tests in 27.271s 2022-09-27T16:35:00.9584617Z 2022-09-27T16:35:00.9584732Z OK (skipped=11) 2022-09-27T16:35:00.9586721Z 2022-09-27T16:35:00.9587064Z Generating XML reports... 2022-09-27T16:35:00.9635265Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20220927163433.xml 2022-09-27T16:35:01.3280721Z Running distributed/fsdp/test_fsdp_ignored_modules ... 
[2022-09-27 16:35:01.327558] 2022-09-27T16:35:01.3281776Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:35:01.327635] 2022-09-27T16:35:03.2294440Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules 2022-09-27T16:35:03.2310682Z 2022-09-27T16:35:03.2311065Z Running tests... 2022-09-27T16:35:03.2311767Z ---------------------------------------------------------------------- 2022-09-27T16:35:03.2321169Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_False (__main__.TestFSDPIgnoredModules) 2022-09-27T16:35:04.8012810Z Tests ignoring different modules across ranks. ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:35:04.8198700Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66748 2022-09-27T16:35:04.8206204Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66749 2022-09-27T16:35:06.4353576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:06.4354069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:06.4354881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:06.4355363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:06.4540022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:06.4540470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:06.4543762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:06.4544251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:06.6978904Z dist init r=1, world=2 2022-09-27T16:35:06.6983377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:06.7060462Z dist init r=0, world=2 2022-09-27T16:35:06.7065743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:06.7066821Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:06.7086132Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:08.0994316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:08.0994847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:09.0292657Z ok (5.798s) 2022-09-27T16:35:09.0301036Z test_diff_ignored_modules_across_ranks_pass_ignored_modules_to_root_True (__main__.TestFSDPIgnoredModules) 2022-09-27T16:35:09.0314970Z Tests ignoring different modules across ranks. ... 
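The paired "Added key: store_based_barrier_key:1 to store for rank: N" / "Completed store-based barrier ... with 2 nodes" lines are c10d's store-based barrier: each rank increments a shared counter in the rendezvous store, then polls until the counter reaches world_size. A minimal sketch of the same idea against a TCPStore (key name and polling interval are arbitrary here):

    import time
    import torch.distributed as dist

    def store_based_barrier(store, rank, world_size,
                            key="store_based_barrier_key:1"):
        # Each rank bumps the shared counter once ...
        arrived = store.add(key, 1)
        print(f"Added key: {key} to store for rank: {rank}")
        # ... then waits until every rank has done the same.
        while arrived < world_size:
            time.sleep(0.01)
            arrived = store.add(key, 0)  # add(0) reads the current value
        print(f"Rank {rank}: Completed store-based barrier for key:{key} "
              f"with {world_size} nodes.")

    # Usage sketch: rank 0 hosts the store, the others connect to it.
    # store = dist.TCPStore("127.0.0.1", 29500, world_size, rank == 0)
    # store_based_barrier(store, rank, world_size)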
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66833 2022-09-27T16:35:09.0321015Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66834 2022-09-27T16:35:10.6656685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:10.6657214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:10.6658491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:10.6658977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:10.6777880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:10.6778367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:10.6781797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:10.6782302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:10.9224866Z dist init r=0, world=2 2022-09-27T16:35:10.9228847Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:10.9239307Z dist init r=1, world=2 2022-09-27T16:35:10.9245157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:10.9246363Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:10.9331707Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:12.2830464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:12.2831508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:13.2404538Z ok (4.211s) 2022-09-27T16:35:13.2409737Z test_ignored_modules_invalid (__main__.TestFSDPIgnoredModules) 2022-09-27T16:35:13.2423314Z Tests that passing an FSDP module as an ignored module or the ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66918 2022-09-27T16:35:13.2429535Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66919 2022-09-27T16:35:14.8638914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:14.8639405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:14.8640646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:14.8641138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:14.9305831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:14.9306295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:14.9307880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:14.9308350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:15.1121845Z dist init r=1, world=2 2022-09-27T16:35:15.1125880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:15.1705624Z dist init r=0, world=2 2022-09-27T16:35:15.1711231Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:15.1712305Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:15.1734709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:16.5525457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:16.5525993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:17.0506605Z ok (3.810s) 2022-09-27T16:35:17.0515144Z test_ignored_modules_nested (__main__.TestFSDPIgnoredModules) 2022-09-27T16:35:17.0528720Z Tests that passing a module with nested FSDP modules does not ... 
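The TestFSDPIgnoredModules cases exercise FSDP's ignored_modules argument, which excludes a submodule's parameters from flattening and sharding. A small sketch of the call shape (the model layout is invented, not the test's actual model):

    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_ignoring_second_layer(model: nn.Sequential):
        # Only model[0]'s parameters are flattened and sharded;
        # model[1]'s stay ordinary local tensors.
        # Call after torch.distributed.init_process_group(...).
        return FSDP(model, ignored_modules=[model[1]])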
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66999 2022-09-27T16:35:17.0535085Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67000 2022-09-27T16:35:18.6867674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:18.6868180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:18.6869636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:18.6870139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:18.7178363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:18.7178816Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:18.7181906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:18.7182390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:18.9619555Z dist init r=0, world=2 2022-09-27T16:35:18.9624226Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:18.9709320Z dist init r=1, world=2 2022-09-27T16:35:18.9755306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:18.9756118Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:18.9828965Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:20.3695757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:20.3696315Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:21.2620863Z ok (4.211s) 2022-09-27T16:35:21.2630812Z test_ignored_modules_transformer (__main__.TestFSDPIgnoredModules) 2022-09-27T16:35:21.2644139Z Tests that ignored modules' parameters are not flattened for a ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67084 2022-09-27T16:35:21.2649791Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67085 2022-09-27T16:35:22.9527634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:22.9528131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:22.9529052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:22.9529525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:22.9942577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:22.9943048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:22.9946016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:22.9946495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:23.2076006Z dist init r=1, world=2 2022-09-27T16:35:23.2079788Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:23.2361091Z dist init r=0, world=2 2022-09-27T16:35:23.2366252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:23.2367056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:23.2384964Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:24.6158571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:24.6159095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:25.8740309Z ok (4.612s) 2022-09-27T16:35:25.8740530Z 2022-09-27T16:35:25.8742753Z ---------------------------------------------------------------------- 2022-09-27T16:35:25.8743470Z Ran 5 tests in 22.643s 2022-09-27T16:35:25.8743671Z 2022-09-27T16:35:25.8743766Z OK 2022-09-27T16:35:25.8743900Z 2022-09-27T16:35:25.8744016Z Generating XML reports... 2022-09-27T16:35:25.8792964Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20220927163503.xml 2022-09-27T16:35:26.2403511Z Running distributed/_shard/checkpoint/test_file_system_checkpoint ... [2022-09-27 16:35:26.239837] 2022-09-27T16:35:26.2404336Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/checkpoint/test_file_system_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:35:26.239912] 2022-09-27T16:35:28.1407743Z Test results will be stored in test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint 2022-09-27T16:35:28.1426835Z 2022-09-27T16:35:28.1427103Z Running tests... 2022-09-27T16:35:28.1427541Z ---------------------------------------------------------------------- 2022-09-27T16:35:29.6879358Z test_load_rowwise_to_colwise (__main__.TestDistributedReshardOnLoad) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:35:29.7059781Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67204 2022-09-27T16:35:29.7066715Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67205 2022-09-27T16:35:31.3160809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:31.3161801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:31.3163007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:31.3163925Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:31.3364084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:31.3365014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:31.3366816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:31.3367761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:31.5667668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:31.5775962Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:31.5866019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:31.5975993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:31.5976804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:31.6070468Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:33.5146905Z ok (5.372s) 2022-09-27T16:35:33.5190820Z test_load_with_different_shard_plan (__main__.TestDistributedReshardOnLoad) ... 
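The test_file_system_checkpoint suite saves sharded state dicts to disk and reloads them, including under a different shard plan. A rough sketch of that round trip, assuming the FileSystemWriter/FileSystemReader API that torch.distributed._shard.checkpoint exposed in this era (an ordinary state_dict stands in for a sharded one, and a process group may be required):

    import torch.nn as nn
    from torch.distributed._shard.checkpoint import (
        FileSystemWriter, FileSystemReader,
        save_state_dict, load_state_dict,
    )

    def round_trip(model: nn.Module, path: str = "/tmp/ckpt"):
        state = model.state_dict()
        save_state_dict(state, FileSystemWriter(path))   # write shards out
        load_state_dict(state, FileSystemReader(path))   # read back in place
        model.load_state_dict(state)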
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67289 2022-09-27T16:35:33.5198242Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67290 2022-09-27T16:35:35.1121807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:35.1122820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:35.1123991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:35.1124909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:35.1597101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:35.1598102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:35.1599968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:35.1600929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:35.3528725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:35.3725745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:35.3962201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:35.4166053Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:35.4167761Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:35.4234307Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:37.8290331Z ok (4.314s) 2022-09-27T16:35:37.8310636Z test_save_load_bytes (__main__.TestDistributedReshardOnLoad) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67374 2022-09-27T16:35:37.8317204Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67375 2022-09-27T16:35:39.4739892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:39.4740418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:39.4740994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:39.4741493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:39.5021905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:39.5022381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:39.5024652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:39.5025136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:39.7299545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:39.7388647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:39.7499622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:39.7592294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:39.7593160Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:39.7602326Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:41.6395625Z ok (3.811s) 2022-09-27T16:35:41.6426669Z test_switch_between_sharded_tensor_to_tensor (__main__.TestDistributedReshardOnLoad) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67459 2022-09-27T16:35:41.6433093Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67460 2022-09-27T16:35:43.2787162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:43.2787956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:43.2788564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:43.2789261Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:43.3121839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:43.3122304Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:43.3125684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:43.3126170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:43.5271890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:43.5469800Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:43.5487322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:43.5687169Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:43.5688150Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:43.5776168Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:46.0530200Z ok (4.413s) 2022-09-27T16:35:46.1029554Z test_read_write_only_tensor (__main__.TestDistributedStateDictSaveLoad) ... ok (0.050s) 2022-09-27T16:35:46.1053340Z test_read_write_shard_tensor (__main__.TestDistributedStateDictSaveLoadWithSharedTensor) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67544 2022-09-27T16:35:46.1059682Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67545 2022-09-27T16:35:47.7391131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:47.7391685Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:47.7393032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:47.7393576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:47.7446540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:47.7446998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:47.7449231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:47.7449703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:47.9763179Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:47.9844896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:47.9962905Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:48.0042565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:48.0043321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:48.0065597Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:49.9138159Z ok (3.811s) 2022-09-27T16:35:49.9138541Z 2022-09-27T16:35:49.9139108Z ---------------------------------------------------------------------- 2022-09-27T16:35:49.9139448Z Ran 6 tests in 21.771s 2022-09-27T16:35:49.9139618Z 2022-09-27T16:35:49.9139713Z OK 2022-09-27T16:35:49.9142721Z 2022-09-27T16:35:49.9142986Z Generating XML reports... 2022-09-27T16:35:49.9181258Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedReshardOnLoad-20220927163528.xml 2022-09-27T16:35:49.9184295Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoad-20220927163528.xml 2022-09-27T16:35:49.9187691Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20220927163528.xml 2022-09-27T16:35:50.3045976Z Running distributed/fsdp/test_fsdp_memory ... [2022-09-27 16:35:50.304123] 2022-09-27T16:35:50.3046713Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_memory.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:35:50.304200] 2022-09-27T16:35:52.2060428Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_memory 2022-09-27T16:35:52.2078694Z 2022-09-27T16:35:52.2079169Z Running tests... 
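Each "Running <test file> ... Executing [...]" pair shows PyTorch's test driver shelling out to a fresh interpreter per test file, with -bb turning bytes/str comparison warnings into errors. A simplified sketch of that launch step (the function name is illustrative):

    import subprocess
    import sys
    from datetime import datetime

    def run_test_file(test_file):
        cmd = [sys.executable, "-bb", test_file, "-v",
               "--import-slow-tests", "--import-disabled-tests"]
        print(f"Executing {cmd} ... [{datetime.now()}]")
        # One interpreter per file keeps CUDA state and global flags isolated.
        return subprocess.run(cmd, check=False).returncode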
2022-09-27T16:35:52.2079668Z ---------------------------------------------------------------------- 2022-09-27T16:35:53.7459953Z test_fsdp_memory_ckpt_ckpt (__main__.TestFSDPMemory) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:35:53.7639163Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67664 2022-09-27T16:35:53.7646476Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67665 2022-09-27T16:35:55.3225523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:55.3226033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:55.3227188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:55.3227643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:55.4211408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:35:55.4211894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:35:55.4213841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:35:55.4214293Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:35:55.5539245Z dist init r=0, world=2 2022-09-27T16:35:55.5542981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:35:55.6453603Z dist init r=1, world=2 2022-09-27T16:35:55.6459346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:35:55.6461369Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:55.6556690Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:35:57.0336389Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:35:57.0336906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:35:57.0685966Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:35:57.0686753Z warnings.warn( 2022-09-27T16:35:57.0696744Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:35:57.0697538Z warnings.warn( 2022-09-27T16:36:01.1814264Z ok (8.973s) 2022-09-27T16:36:01.1843102Z test_fsdp_memory_ckpt_no_ckpt (__main__.TestFSDPMemory) ... 
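The UserWarning repeated above ("Module is put on CPU ...") points at FSDP's device_id argument: passing it lets FSDP move the module to the GPU before flattening and sharding, which is also what sync_module_states=True needs. A sketch of the recommended call, again assuming an initialized process group:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(model):
        # device_id moves the module to this rank's GPU first, so
        # flatten/shard (and sync_module_states) use GPU communication.
        return FSDP(
            model,
            device_id=torch.cuda.current_device(),
            sync_module_states=True,
        )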
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67779 2022-09-27T16:36:01.1849139Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67780 2022-09-27T16:36:02.8590461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:02.8591315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:02.8595571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:02.8596353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:02.8988162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:02.8988763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:02.8991717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:02.8992194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:03.0926457Z dist init r=1, world=2 2022-09-27T16:36:03.0930038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:36:03.1239125Z dist init r=0, world=2 2022-09-27T16:36:03.1244502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:36:03.1245653Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:03.1336342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:04.5203403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:04.5203917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:04.5574975Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:04.5575779Z warnings.warn( 2022-09-27T16:36:04.5576906Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:04.5577634Z warnings.warn( 2022-09-27T16:36:08.1979025Z ok (7.016s) 2022-09-27T16:36:08.1979252Z 2022-09-27T16:36:08.1979666Z ---------------------------------------------------------------------- 2022-09-27T16:36:08.1979993Z Ran 2 tests in 15.990s 2022-09-27T16:36:08.1980157Z 2022-09-27T16:36:08.1980251Z OK 2022-09-27T16:36:08.1980388Z 2022-09-27T16:36:08.1980522Z Generating XML reports... 
2022-09-27T16:36:08.2018914Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_memory/TEST-TestFSDPMemory-20220927163552.xml 2022-09-27T16:36:08.5751414Z Running distributed/_shard/sharding_plan/test_sharding_plan ... [2022-09-27 16:36:08.574600] 2022-09-27T16:36:08.5752492Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharding_plan/test_sharding_plan.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:36:08.574675] 2022-09-27T16:36:10.4330259Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan 2022-09-27T16:36:10.4347007Z 2022-09-27T16:36:10.4347521Z Running tests... 2022-09-27T16:36:10.4348012Z ---------------------------------------------------------------------- 2022-09-27T16:36:11.9532483Z test_custom_sharding_planner (__main__.TestShardingPlan) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:36:11.9712129Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67929 2022-09-27T16:36:11.9718528Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67930 2022-09-27T16:36:11.9725025Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 67931 2022-09-27T16:36:11.9731562Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 67932 2022-09-27T16:36:13.6013963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:13.6014469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:13.6015910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:13.6016371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:13.6095673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:13.6096132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:13.6099970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:13.6100435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:13.6243813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:13.6244357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:13.6248725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:13.6249179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:13.6616437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:13.6616923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:13.6619252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:13.6619752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:13.8375229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:13.8477940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
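The "Generated XML report: test-reports/python-unittest/..." lines are JUnit-style XML files written per test class for CI ingestion. PyTorch uses its own in-tree reporter, so the following is an analogy rather than the actual code path; the third-party unittest-xml-reporting package produces the same kind of output:

    import unittest
    import xmlrunner  # pip install unittest-xml-reporting

    class SmokeTest(unittest.TestCase):
        def test_truth(self):
            self.assertTrue(True)

    if __name__ == "__main__":
        # Writes TEST-SmokeTest-<timestamp>.xml under test-reports/.
        unittest.main(testRunner=xmlrunner.XMLTestRunner(output="test-reports"))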
2022-09-27T16:36:13.8510847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:13.8820852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:14.2787325Z skip: Need at least 4 CUDA devices (3.844s) 2022-09-27T16:36:14.2817493Z test_reshard_to_ddp_sharding_plan (__main__.TestShardingPlan) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68065 2022-09-27T16:36:14.2823607Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68066 2022-09-27T16:36:14.2829969Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68067 2022-09-27T16:36:14.2836847Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68068 2022-09-27T16:36:15.8885162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:15.8885942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:15.8887009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:15.8887503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:15.9143000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:15.9143481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:15.9146565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:15.9147225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:15.9227747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:15.9228217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:15.9231521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:15.9232003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:15.9879981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:15.9880467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:15.9882830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:15.9883325Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:16.1460867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:16.1613636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:16.1692498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:16.2104888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:16.5892998Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:36:16.5914343Z test_shard_module_sub_process_group (__main__.TestShardingPlan) ... 
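test_shard_module_sub_process_group targets sharding over only a subset of ranks; the underlying primitive is torch.distributed.new_group. A minimal sketch (ranks chosen arbitrarily, assumes a 4-rank job that has already called init_process_group):

    import torch.distributed as dist

    def make_first_pair_group():
        # new_group is collective over the default group:
        # every rank must call it, even ranks not in the subgroup.
        subgroup = dist.new_group(ranks=[0, 1])
        if dist.get_rank() in (0, 1):
            # Collectives can now be scoped to the subgroup.
            dist.barrier(group=subgroup)
        return subgroup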
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68201 2022-09-27T16:36:16.5920806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68202 2022-09-27T16:36:16.5927284Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68203 2022-09-27T16:36:16.5933836Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68204 2022-09-27T16:36:18.2103465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:18.2103989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:18.2105064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:18.2105541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:18.2116485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:18.2116943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:18.2120551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:18.2121041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:18.2685425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:18.2685937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:18.2688452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:18.2688933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:18.2768485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:18.2768963Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:18.2771685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:18.2772400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:18.4371356Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:18.4489614Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:18.4956246Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:18.5081884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:18.8990393Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:36:18.9017773Z test_sharding_plan_errors (__main__.TestShardingPlan) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68337 2022-09-27T16:36:18.9024141Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68338 2022-09-27T16:36:18.9030578Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68339 2022-09-27T16:36:18.9037463Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68340 2022-09-27T16:36:20.5306289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:20.5306869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:20.5308446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:20.5309286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:20.5615957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:20.5616775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:20.5620426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:20.5621189Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:20.5916612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:20.5917352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:20.5920343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:20.5921179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:20.5995910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:20.5996475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:20.5999758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:20.6000547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:20.7767924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:20.7912122Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:20.8167620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:20.8190919Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:21.2094024Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:36:21.2145663Z test_sharding_plan_simple_megatron (__main__.TestShardingPlan) ... 
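The TestShardingPlan cases, including test_sharding_plan_simple_megatron, describe where each tensor's shards should live across ranks. The building block in this era's API was ChunkShardingSpec; a sketch of a spec that row-shards a weight across four GPUs (placement strings follow the "rank:<r>/cuda:<d>" form this API uses):

    from torch.distributed._shard.sharding_spec import ChunkShardingSpec

    # Split dim 0 into contiguous chunks, one per rank/device.
    rowwise_spec = ChunkShardingSpec(
        dim=0,
        placements=[
            "rank:0/cuda:0",
            "rank:1/cuda:1",
            "rank:2/cuda:2",
            "rank:3/cuda:3",
        ],
    )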
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68473 2022-09-27T16:36:21.2152726Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68474 2022-09-27T16:36:21.2159514Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68475 2022-09-27T16:36:21.2166326Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68476 2022-09-27T16:36:22.8724143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:22.8724705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:22.8726104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:22.8726936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:22.8901460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:22.8902235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:22.8905948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:22.8906743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:22.8939481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:22.8940233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:22.8944002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:22.8944769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:22.9051354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:22.9052130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:22.9055766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:22.9056569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:23.1350523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:23.1354867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:23.1390788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:23.1411832Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:23.5221685Z skip: Need at least 4 CUDA devices (2.313s) 2022-09-27T16:36:23.5222136Z 2022-09-27T16:36:23.5222910Z ---------------------------------------------------------------------- 2022-09-27T16:36:23.5223453Z Ran 5 tests in 13.087s 2022-09-27T16:36:23.5223623Z 2022-09-27T16:36:23.5223736Z OK (skipped=5) 2022-09-27T16:36:23.5223874Z 2022-09-27T16:36:23.5224002Z Generating XML reports... 2022-09-27T16:36:23.5264054Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan/TEST-TestShardingPlan-20220927163610.xml 2022-09-27T16:36:23.8903271Z Running distributed/_shard/test_partial_tensor ... 
[2022-09-27 16:36:23.889775] 2022-09-27T16:36:23.8904325Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/test_partial_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:36:23.889851] 2022-09-27T16:36:25.7455189Z Test results will be stored in test-reports/python-unittest/distributed._shard.test_partial_tensor 2022-09-27T16:36:25.7474409Z 2022-09-27T16:36:25.7474933Z Running tests... 2022-09-27T16:36:25.7475460Z ---------------------------------------------------------------------- 2022-09-27T16:36:27.2384943Z test_cat (__main__.TestPartialTensorOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:36:27.3067830Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68644 2022-09-27T16:36:27.3073997Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68645 2022-09-27T16:36:27.3080368Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68646 2022-09-27T16:36:27.3087157Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68647 2022-09-27T16:36:28.9286907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:28.9287444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:28.9288439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:28.9288955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:28.9453617Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:28.9454080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:28.9457170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:28.9457664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:29.0063304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:29.0063811Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:29.0064421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:29.0065038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:29.0065626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:29.0066081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:29.0066968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:29.0067462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:29.1696617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:29.1707338Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:29.2491992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:29.2492497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:29.6146325Z skip: Need at least 4 
CUDA devices (3.867s) 2022-09-27T16:36:29.6168172Z test_cat_errors (__main__.TestPartialTensorOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68780 2022-09-27T16:36:29.6173999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68781 2022-09-27T16:36:29.6180165Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68782 2022-09-27T16:36:29.6186656Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68783 2022-09-27T16:36:31.2554941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:31.2555641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:31.2556962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:31.2557503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:31.2628694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:31.2629159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:31.2632071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:31.2632565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:31.2633371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:31.2633828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:31.2637397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:31.2637885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:31.3114293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:31.3114801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:31.3115697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:31.3116159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:31.5007476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:31.5034639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:31.5044031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:31.5385684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:31.9240747Z skip: Need at least 4 CUDA devices (2.309s) 2022-09-27T16:36:31.9257788Z test_transpose (__main__.TestPartialTensorOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68916 2022-09-27T16:36:31.9263880Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68917 2022-09-27T16:36:31.9270032Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 68918 2022-09-27T16:36:31.9277563Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 68919 2022-09-27T16:36:33.5497670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:33.5498199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:33.5499127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:33.5499609Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:33.5599006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:33.5599475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:33.5602355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:33.5602846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:33.5669467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:33.5669942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:33.5673546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:33.5674063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:33.6101967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:33.6102458Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:33.6103308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:33.6103781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:33.7855607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:33.7930714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:33.7996268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:33.8238971Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:34.2332587Z skip: Need at least 4 CUDA devices (2.309s) 2022-09-27T16:36:34.2361049Z test_partial_tensor_reshard (__main__.TestPartialTensorReshard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69052 2022-09-27T16:36:34.2367895Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69053 2022-09-27T16:36:34.2374860Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69054 2022-09-27T16:36:34.2382219Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69055 2022-09-27T16:36:35.8348874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:35.8349461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:35.8350417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:35.8351250Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:35.8639280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:35.8639758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:35.8642770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:35.8643242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:35.8668797Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:35.8669248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:35.8671902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:35.8672370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:35.8708366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:35.8708798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:35.8711351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:35.8712030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:36.0917156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:36.0944863Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:36.0998515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:36.1101900Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:36.5442251Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:36:36.5467284Z test_partial_tensor_reshard_errors (__main__.TestPartialTensorReshard) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69188 2022-09-27T16:36:36.5474709Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69189 2022-09-27T16:36:36.5480978Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69190 2022-09-27T16:36:36.5487917Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69191 2022-09-27T16:36:38.1677461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:38.1678270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:38.1678871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:38.1679348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:38.1879908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:38.1880386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:38.1883230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:38.1883706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:38.1911242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:38.1911904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:38.1915633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:38.1916112Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:38.2094586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:38.2095027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:38.2098030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:38.2098500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:38.4191555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:36:38.4292304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:38.4334165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:36:38.4350329Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:38.8543246Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:36:38.8543504Z 2022-09-27T16:36:38.8543892Z ---------------------------------------------------------------------- 2022-09-27T16:36:38.8544236Z Ran 5 tests in 13.107s 2022-09-27T16:36:38.8544412Z 2022-09-27T16:36:38.8544524Z OK (skipped=5) 2022-09-27T16:36:38.8544659Z 2022-09-27T16:36:38.8544787Z Generating XML reports... 
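[Annotation] Every test in distributed/_shard/test_partial_tensor above was skipped because this runner exposes fewer than four GPUs. A minimal sketch of the gating these multi-process tests rely on, assuming the torch.testing._internal.common_distributed helpers named throughout the log (the class and test names below are illustrative):

import torch  # device count is checked inside the decorator
from torch.testing._internal.common_distributed import (
    MultiProcessTestCase,
    skip_if_lt_x_gpu,
)

class PartialTensorSkipSketch(MultiProcessTestCase):
    @property
    def world_size(self) -> int:
        return 4  # one worker process per expected GPU, as in the log

    def setUp(self) -> None:
        super().setUp()
        self._spawn_processes()  # produces the "Started process N with pid ..." lines

    @skip_if_lt_x_gpu(4)
    def test_cat(self):
        # Reached only when torch.cuda.device_count() >= 4; otherwise each
        # worker exits with a skip code that the harness reports as
        # "skip: Need at least 4 CUDA devices (...)".
        pass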
2022-09-27T16:36:38.8584002Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20220927163625.xml 2022-09-27T16:36:38.8589277Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20220927163625.xml 2022-09-27T16:36:39.2179937Z Running distributed/fsdp/test_fsdp_apply ... [2022-09-27 16:36:39.217492] 2022-09-27T16:36:39.2180955Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_apply.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:36:39.217566] 2022-09-27T16:36:41.0626773Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_apply 2022-09-27T16:36:41.0643469Z 2022-09-27T16:36:41.0643882Z Running tests... 2022-09-27T16:36:41.0644363Z ---------------------------------------------------------------------- 2022-09-27T16:36:41.0649140Z test_apply_in_summon_raises_error (__main__.TestApply) 2022-09-27T16:36:42.5896211Z Tests that calling ``apply()`` on an FSDP instance inside the ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:36:42.6076334Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69359 2022-09-27T16:36:42.6083064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69360 2022-09-27T16:36:44.2470018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:44.2470545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:44.2472020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:44.2472509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:44.2615546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:44.2616011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:44.2618595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:44.2619081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:44.4844505Z dist init r=0, world=2 2022-09-27T16:36:44.4848869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:36:44.4899183Z dist init r=1, world=2 2022-09-27T16:36:44.4904497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:36:44.4905484Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:44.4951977Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:45.8963666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:45.8964178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:45.9282219Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. 
We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:45.9283017Z warnings.warn( 2022-09-27T16:36:45.9314561Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:45.9315319Z warnings.warn( 2022-09-27T16:36:45.9461931Z Asserting FSDP instance is: FullyShardedDataParallel( 2022-09-27T16:36:45.9462308Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-09-27T16:36:45.9462678Z (_fpw_module): TransformerWithSharedParams( 2022-09-27T16:36:45.9463006Z (embed_tokens): Embedding(23, 16) 2022-09-27T16:36:45.9464677Z (transformer): Transformer( 2022-09-27T16:36:45.9465961Z (encoder): TransformerEncoder( 2022-09-27T16:36:45.9466283Z (layers): ModuleList( 2022-09-27T16:36:45.9466598Z (0): FullyShardedDataParallel( 2022-09-27T16:36:45.9467267Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-09-27T16:36:45.9467891Z (_fpw_module): TransformerEncoderLayer( 2022-09-27T16:36:45.9468471Z (self_attn): MultiheadAttention( 2022-09-27T16:36:45.9469139Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-09-27T16:36:45.9469512Z ) 2022-09-27T16:36:45.9469802Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-09-27T16:36:45.9470182Z (dropout): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9471005Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-09-27T16:36:45.9471885Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9472523Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9473023Z (dropout1): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9473779Z (dropout2): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9474363Z ) 2022-09-27T16:36:45.9474838Z ) 2022-09-27T16:36:45.9475064Z ) 2022-09-27T16:36:45.9475343Z (1): FullyShardedDataParallel( 2022-09-27T16:36:45.9475677Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-09-27T16:36:45.9476040Z (_fpw_module): TransformerEncoderLayer( 2022-09-27T16:36:45.9476375Z (self_attn): MultiheadAttention( 2022-09-27T16:36:45.9476793Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-09-27T16:36:45.9477137Z ) 2022-09-27T16:36:45.9477453Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-09-27T16:36:45.9477807Z (dropout): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9478141Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-09-27T16:36:45.9478617Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9479064Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9479408Z (dropout1): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9479743Z (dropout2): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9480012Z ) 2022-09-27T16:36:45.9480245Z ) 2022-09-27T16:36:45.9480449Z ) 2022-09-27T16:36:45.9480663Z ) 2022-09-27T16:36:45.9481044Z (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9481322Z ) 
2022-09-27T16:36:45.9481583Z (decoder): TransformerDecoder( 2022-09-27T16:36:45.9481872Z (layers): ModuleList( 2022-09-27T16:36:45.9482150Z (0): FullyShardedDataParallel( 2022-09-27T16:36:45.9482494Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-09-27T16:36:45.9482857Z (_fpw_module): TransformerDecoderLayer( 2022-09-27T16:36:45.9483173Z (self_attn): MultiheadAttention( 2022-09-27T16:36:45.9483591Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-09-27T16:36:45.9483944Z ) 2022-09-27T16:36:45.9484214Z (multihead_attn): MultiheadAttention( 2022-09-27T16:36:45.9484632Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-09-27T16:36:45.9484985Z ) 2022-09-27T16:36:45.9485292Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-09-27T16:36:45.9485626Z (dropout): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9486101Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-09-27T16:36:45.9486575Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9487020Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9487458Z (norm3): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9487821Z (dropout1): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9488137Z (dropout2): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9488462Z (dropout3): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9488736Z ) 2022-09-27T16:36:45.9488970Z ) 2022-09-27T16:36:45.9489250Z ) 2022-09-27T16:36:45.9489524Z (1): FullyShardedDataParallel( 2022-09-27T16:36:45.9489877Z (_fsdp_wrapped_module): FlattenParamsWrapper( 2022-09-27T16:36:45.9490222Z (_fpw_module): TransformerDecoderLayer( 2022-09-27T16:36:45.9490557Z (self_attn): MultiheadAttention( 2022-09-27T16:36:45.9490972Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-09-27T16:36:45.9491308Z ) 2022-09-27T16:36:45.9491598Z (multihead_attn): MultiheadAttention( 2022-09-27T16:36:45.9492020Z (out_proj): NonDynamicallyQuantizableLinear(in_features=16, out_features=16, bias=True) 2022-09-27T16:36:45.9492351Z ) 2022-09-27T16:36:45.9492660Z (linear1): Linear(in_features=16, out_features=8, bias=True) 2022-09-27T16:36:45.9493014Z (dropout): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9493374Z (linear2): Linear(in_features=8, out_features=16, bias=True) 2022-09-27T16:36:45.9493821Z (norm1): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9494275Z (norm2): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9494722Z (norm3): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9495060Z (dropout1): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9495388Z (dropout2): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9495715Z (dropout3): Dropout(p=0.1, inplace=False) 2022-09-27T16:36:45.9495979Z ) 2022-09-27T16:36:45.9496188Z ) 2022-09-27T16:36:45.9496408Z ) 2022-09-27T16:36:45.9496629Z ) 2022-09-27T16:36:45.9496985Z (norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True) 2022-09-27T16:36:45.9497277Z ) 2022-09-27T16:36:45.9497497Z ) 2022-09-27T16:36:45.9497782Z (output_proj): Linear(in_features=16, out_features=23, bias=True) 2022-09-27T16:36:45.9498281Z (bn): BatchNorm1d(2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 2022-09-27T16:36:45.9498599Z ) 2022-09-27T16:36:45.9498790Z ) 2022-09-27T16:36:45.9498996Z ) 2022-09-27T16:36:45.9499364Z ERROR: 
expected to be in states [<TrainingState_.IDLE: 1>] but current state is TrainingState_.SUMMON_FULL_PARAMS 2022-09-27T16:36:45.9499737Z File "<string>", line 1, in <module> 2022-09-27T16:36:45.9500109Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-09-27T16:36:45.9500478Z exitcode = _main(fd, parent_sentinel) 2022-09-27T16:36:45.9500830Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-09-27T16:36:45.9501199Z return self._bootstrap(parent_sentinel) 2022-09-27T16:36:45.9501581Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-09-27T16:36:45.9501922Z self.run() 2022-09-27T16:36:45.9502236Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-09-27T16:36:45.9502599Z self._target(*self._args, **self._kwargs) 2022-09-27T16:36:45.9503170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 785, in _run 2022-09-27T16:36:45.9503548Z self.run_test(test_name, pipe) 2022-09-27T16:36:45.9504072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 622, in run_test 2022-09-27T16:36:45.9504461Z getattr(self, test_name)() 2022-09-27T16:36:45.9504956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 503, in wrapper 2022-09-27T16:36:45.9505323Z fn() 2022-09-27T16:36:45.9505805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 145, in wrapper 2022-09-27T16:36:45.9506188Z return func(*args, **kwargs) 2022-09-27T16:36:45.9506644Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_apply.py", line 101, in test_apply_in_summon_raises_error 2022-09-27T16:36:45.9507085Z transformer.apply(self._init_linear_weights) 2022-09-27T16:36:45.9507649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1655, in apply 2022-09-27T16:36:45.9508058Z self._assert_state(TrainingState_.IDLE) 2022-09-27T16:36:45.9508622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 3549, in _assert_state 2022-09-27T16:36:45.9509032Z traceback.print_stack() 2022-09-27T16:36:46.4162679Z ok (5.352s) 2022-09-27T16:36:46.4166412Z test_nested_module_apply (__main__.TestApply) 2022-09-27T16:36:46.4181442Z Tests that ``apply()`` modifies parameter values in-place on a ...
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69440 2022-09-27T16:36:46.4188152Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69441 2022-09-27T16:36:48.0277527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:48.0278028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:48.0278628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:48.0279099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:48.0470806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:48.0471553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:48.0474988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:48.0475470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:48.2737150Z dist init r=1, world=2 2022-09-27T16:36:48.2741071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:36:48.2772465Z dist init r=0, world=2 2022-09-27T16:36:48.2777817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:36:48.2778943Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:48.2843812Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:49.6867047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:49.6867895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:49.7098389Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:49.7099216Z warnings.warn( 2022-09-27T16:36:49.7100356Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:49.7101104Z warnings.warn( 2022-09-27T16:36:50.3298215Z ok (3.913s) 2022-09-27T16:36:50.3301638Z test_transformer_module_apply (__main__.TestApply) 2022-09-27T16:36:50.3316160Z Tests that ``apply()`` modifies parameter values in-place on an ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69521 2022-09-27T16:36:50.3321614Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69522 2022-09-27T16:36:52.0255744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:52.0256265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:52.0257590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:52.0258058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:52.0344158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:52.0344617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:52.0347835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:52.0348306Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:52.2610155Z dist init r=0, world=2 2022-09-27T16:36:52.2614547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:36:52.2687908Z dist init r=1, world=2 2022-09-27T16:36:52.2693225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:36:52.2694112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:52.2717231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:36:53.6637619Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:36:53.6638165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:36:53.6962359Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:53.6963246Z warnings.warn( 2022-09-27T16:36:53.6964366Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:36:53.6965104Z warnings.warn( 2022-09-27T16:36:54.5402238Z ok (4.210s) 2022-09-27T16:36:54.5402442Z 2022-09-27T16:36:54.5403112Z ---------------------------------------------------------------------- 2022-09-27T16:36:54.5403468Z Ran 3 tests in 13.476s 2022-09-27T16:36:54.5403635Z 2022-09-27T16:36:54.5403736Z OK 2022-09-27T16:36:54.5403853Z 2022-09-27T16:36:54.5403987Z Generating XML reports... 
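[Annotation] The ERROR block above is the expected outcome of test_apply_in_summon_raises_error: FSDP's apply() asserts the IDLE training state, so calling it while parameters are materialized under summon_full_params raises. The repeated "Module is put on CPU" UserWarning is what passing device_id avoids. A minimal sketch, assuming a NCCL process group is already initialized as in the "dist init r=.., world=2" lines (the helper name init_linear_weights is illustrative):

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def init_linear_weights(m):
    # Illustrative stand-in for the test's _init_linear_weights helper.
    if isinstance(m, nn.Linear):
        nn.init.ones_(m.weight)

# device_id moves flattening/sharding onto the GPU, avoiding the
# "Module is put on CPU ..." UserWarning seen repeatedly above.
model = FSDP(nn.Linear(16, 16), device_id=torch.cuda.current_device())

model.apply(init_linear_weights)  # OK: FSDP is in TrainingState_.IDLE

with FSDP.summon_full_params(model):
    # apply() asserts the IDLE state first, so this raises, producing the
    # "expected to be in states [<TrainingState_.IDLE: 1>] ..." error above.
    model.apply(init_linear_weights)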
2022-09-27T16:36:54.5441842Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_apply/TEST-TestApply-20220927163641.xml 2022-09-27T16:36:54.9165349Z Running distributed/_shard/sharded_tensor/ops/test_binary_cmp ... [2022-09-27 16:36:54.916022] 2022-09-27T16:36:54.9166165Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_binary_cmp.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:36:54.916096] 2022-09-27T16:36:56.7980331Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp 2022-09-27T16:36:56.7996470Z 2022-09-27T16:36:56.7996737Z Running tests... 2022-09-27T16:36:56.7997198Z ---------------------------------------------------------------------- 2022-09-27T16:36:56.8006766Z test_torch_allclose (__main__.TestShardedTensorBinaryOps) 2022-09-27T16:36:58.2654568Z Test torch.allclose(ShardedTensor, ShardedTensor) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:36:58.3283194Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69637 2022-09-27T16:36:58.3288502Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69638 2022-09-27T16:36:58.3295160Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69639 2022-09-27T16:36:58.3301522Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69640 2022-09-27T16:36:59.9743211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:59.9743729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:59.9762717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:59.9763200Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:59.9784288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:59.9784740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:59.9787911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:59.9788477Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:36:59.9931178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:36:59.9931642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:36:59.9933856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:36:59.9934329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:00.0368694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:00.0369168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:00.0370854Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:00.0371324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:00.2076353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:00.2165939Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:00.2225309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:00.2648046Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:00.6360075Z skip: Need at least 4 CUDA devices (3.836s) 2022-09-27T16:37:00.6376569Z test_torch_allclose_tensor_specs (__main__.TestShardedTensorBinaryOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69773 2022-09-27T16:37:00.6383005Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69774 2022-09-27T16:37:00.6389346Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69775 2022-09-27T16:37:00.6396800Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69776 2022-09-27T16:37:02.2721816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:02.2723121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:02.2724284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:02.2725234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:02.3014669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:02.3015613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:02.3017952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:02.3018931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:02.3233512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:02.3234457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:02.3236435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:02.3237359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:02.3435187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:02.3436058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:02.3437115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:02.3437920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:02.5128412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:02.5260860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:02.5468288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:02.5711284Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:02.9458123Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:02.9461837Z test_torch_equal (__main__.TestShardedTensorBinaryOps) 2022-09-27T16:37:02.9476811Z Test torch.equal(ShardedTensor, 
ShardedTensor) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69909 2022-09-27T16:37:02.9482983Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69910 2022-09-27T16:37:02.9489258Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 69911 2022-09-27T16:37:02.9496063Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 69912 2022-09-27T16:37:04.5711570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:04.5712597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:04.5714004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:04.5714919Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:04.5723029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:04.5723948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:04.5727432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:04.5728397Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:04.5868524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:04.5869720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:04.5871297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:04.5872264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:04.5928984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:04.5929934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:04.5932392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:04.5933360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:04.8068211Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:04.8223791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:04.8233976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:04.8301202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:05.2562635Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:05.2579317Z test_torch_equal_tensor_specs (__main__.TestShardedTensorBinaryOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70045 2022-09-27T16:37:05.2586311Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70046 2022-09-27T16:37:05.2593474Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70047 2022-09-27T16:37:05.2600414Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70048 2022-09-27T16:37:06.8745503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:06.8746080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:06.8746680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:06.8747158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:06.8844450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:06.8844914Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:06.8847931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:06.8848411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:06.9051920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:06.9052388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:06.9054615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:06.9055320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:06.9512319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:06.9512854Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:06.9513949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:06.9514418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:07.1191402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:07.1203443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:07.1242453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:07.1740150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:07.5661714Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:07.5662035Z 2022-09-27T16:37:07.5662436Z ---------------------------------------------------------------------- 2022-09-27T16:37:07.5662759Z Ran 4 tests in 10.766s 2022-09-27T16:37:07.5662922Z 2022-09-27T16:37:07.5663048Z OK (skipped=4) 2022-09-27T16:37:07.5663243Z 2022-09-27T16:37:07.5663399Z Generating XML reports... 2022-09-27T16:37:07.5703171Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20220927163656.xml 2022-09-27T16:37:07.9402105Z Running distributed/fsdp/test_fsdp_input ... 
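[Annotation] For the (skipped) test_binary_cmp run above: torch.equal and torch.allclose dispatch to ShardedTensor and compare per-rank shards plus sharding metadata. A minimal sketch of the calls under test, to be run on each of four ranks after init_process_group (the sharding-spec placements are illustrative):

import torch
import torch.distributed._shard.sharded_tensor as sharded_tensor
from torch.distributed._shard.sharding_spec import ChunkShardingSpec

# One chunk of dim 0 per rank, mirroring the 4-process setup in the log.
spec = ChunkShardingSpec(
    dim=0,
    placements=["rank:0/cuda:0", "rank:1/cuda:1", "rank:2/cuda:2", "rank:3/cuda:3"],
)

st1 = sharded_tensor.ones(spec, 10, 10)
st2 = sharded_tensor.ones(spec, 10, 10)

assert torch.equal(st1, st2)     # exact comparison of shards and metadata
assert torch.allclose(st1, st2)  # tolerance-based variant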
[2022-09-27 16:37:07.939694] 2022-09-27T16:37:07.9402846Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_input.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:37:07.939775] 2022-09-27T16:37:09.8182495Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_input 2022-09-27T16:37:09.8199080Z 2022-09-27T16:37:09.8199384Z Running tests... 2022-09-27T16:37:09.8199827Z ---------------------------------------------------------------------- 2022-09-27T16:37:09.8211900Z test_input_type_dict (__main__.TestInput) 2022-09-27T16:37:11.3527164Z Test FSDP with input being a list or a dict, only single GPU. ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:37:11.3707170Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70216 2022-09-27T16:37:12.9913062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:12.9913543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:12.9914904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:12.9915399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:13.2181592Z dist init r=0, world=1 2022-09-27T16:37:13.2185642Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:37:13.2186417Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-09-27T16:37:14.4992532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:14.5216944Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:37:14.5217817Z warnings.warn( 2022-09-27T16:37:15.3783236Z ok (5.558s) 2022-09-27T16:37:15.3795535Z test_input_type_list (__main__.TestInput) 2022-09-27T16:37:15.3809212Z Test FSDP with input being a list or a dict, only single GPU. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70259 2022-09-27T16:37:16.9794677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:16.9795153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:16.9796532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:16.9797007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:17.2178957Z dist init r=0, world=1 2022-09-27T16:37:17.2183452Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:37:17.2184558Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-09-27T16:37:18.5202036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:18.5415198Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:37:18.5415987Z warnings.warn( 2022-09-27T16:37:19.3882901Z ok (4.010s) 2022-09-27T16:37:19.3883144Z 2022-09-27T16:37:19.3883755Z ---------------------------------------------------------------------- 2022-09-27T16:37:19.3884318Z Ran 2 tests in 9.568s 2022-09-27T16:37:19.3884486Z 2022-09-27T16:37:19.3884581Z OK 2022-09-27T16:37:19.3884723Z 2022-09-27T16:37:19.3884860Z Generating XML reports... 2022-09-27T16:37:19.3931576Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20220927163709.xml 2022-09-27T16:37:19.7602459Z Running distributed/_shard/sharded_tensor/ops/test_linear ... [2022-09-27 16:37:19.759739] 2022-09-27T16:37:19.7603258Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_linear.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:37:19.759814] 2022-09-27T16:37:21.6416568Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_linear 2022-09-27T16:37:21.6431994Z 2022-09-27T16:37:21.6432431Z Running tests... 2022-09-27T16:37:21.6432935Z ---------------------------------------------------------------------- 2022-09-27T16:37:23.1475913Z test_sharded_linear_colwise (__main__.TestShardedTensorOpsLinear) ... 
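[Annotation] Looking back at the test_fsdp_input run that finished above: FSDP does not require tensor-only inputs; lists and dicts pass through its forward unchanged to the wrapped module. A minimal single-GPU sketch, assuming a world-size-1 process group is already initialized (the module and key names are illustrative):

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

class DictInputModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(4, 4)

    def forward(self, batch):
        # The dict container is forwarded untouched by FSDP.
        return self.lin(batch["in"])

model = FSDP(DictInputModel().cuda())
out = model({"in": torch.randn(2, 4, device="cuda")})
out.sum().backward()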
INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:37:23.1658433Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70337 2022-09-27T16:37:23.1665493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70338 2022-09-27T16:37:23.1671564Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70339 2022-09-27T16:37:23.1678754Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70340 2022-09-27T16:37:24.7718442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:24.7718960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:24.7721139Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:24.7721626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:24.7998660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:24.7999402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:24.8002057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:24.8002590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:24.8054957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:24.8055539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:24.8059160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:24.8059737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:24.8146554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:24.8147256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:24.8150933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:24.8163026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:25.0218309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:25.0307308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:25.0381295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:25.0414072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:25.4736859Z skip: Need at least 4 CUDA devices (3.830s) 2022-09-27T16:37:25.4778245Z test_sharded_linear_errors (__main__.TestShardedTensorOpsLinear) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70473 2022-09-27T16:37:25.4784930Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70474 2022-09-27T16:37:25.4791689Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70475 2022-09-27T16:37:25.4799458Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70476 2022-09-27T16:37:27.0836418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:27.0836933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:27.0837958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:27.0838444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:27.1087054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:27.1087535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:27.1090929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:27.1091417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:27.1532902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:27.1533597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:27.1535101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:27.1535575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:27.1677017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:27.1677522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:27.1680872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:27.1681373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:27.3241453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:27.3371056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:27.3740659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:27.3929542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:27.7853693Z skip: Need at least 4 CUDA devices (2.312s) 2022-09-27T16:37:27.7880460Z test_sharded_linear_rowwise (__main__.TestShardedTensorOpsLinear) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70609 2022-09-27T16:37:27.7887329Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70610 2022-09-27T16:37:27.7894156Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70611 2022-09-27T16:37:27.7900781Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70612 2022-09-27T16:37:29.4020664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:29.4021192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:29.4022432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:29.4022918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:29.4042307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:29.4042791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:29.4047009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:29.4047508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:29.4585094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:29.4585591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:29.4588266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:29.4588727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:29.5057629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:29.5058167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:29.5060674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:29.5061179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:29.6266028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:29.6394736Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:29.6748876Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:29.7350531Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:30.0956656Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:30.0956922Z 2022-09-27T16:37:30.0957305Z ---------------------------------------------------------------------- 2022-09-27T16:37:30.0957667Z Ran 3 tests in 8.452s 2022-09-27T16:37:30.0957838Z 2022-09-27T16:37:30.0957950Z OK (skipped=3) 2022-09-27T16:37:30.0958085Z 2022-09-27T16:37:30.0958214Z Generating XML reports... 2022-09-27T16:37:30.0998707Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_linear/TEST-TestShardedTensorOpsLinear-20220927163721.xml 2022-09-27T16:37:30.4712665Z Running distributed/_shard/sharded_tensor/ops/test_init ... 
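[Annotation] For the (skipped) test_linear run above: the column-wise case shards nn.Linear's weight along dim 0 (output features), one chunk per rank, while the row-wise case uses dim 1. A minimal sketch of the column-wise setup using shard_parameter, assuming a 4-rank group is already initialized (placements are illustrative):

import torch
import torch.nn as nn
from torch.distributed._shard import shard_parameter
from torch.distributed._shard.sharding_spec import ChunkShardingSpec

linear = nn.Linear(32, 16).cuda()

# Column-wise: split the weight along dim 0, one shard per rank.
colwise_spec = ChunkShardingSpec(
    dim=0,
    placements=["rank:0/cuda:0", "rank:1/cuda:1", "rank:2/cuda:2", "rank:3/cuda:3"],
)
shard_parameter(linear, "weight", colwise_spec)

# The module still accepts a regular local tensor as input; the sharded
# matmul is handled by the ShardedTensor op overrides under test above.
out = linear(torch.randn(8, 32, device="cuda"))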
[2022-09-27 16:37:30.470749] 2022-09-27T16:37:30.4713458Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:37:30.470824] 2022-09-27T16:37:32.3420356Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init 2022-09-27T16:37:32.3436065Z 2022-09-27T16:37:32.3436351Z Running tests... 2022-09-27T16:37:32.3436798Z ---------------------------------------------------------------------- 2022-09-27T16:37:32.3448768Z test_init_sharded_tensor_with_kaiming_uniform (__main__.TestShardedTensorNNInit) 2022-09-27T16:37:33.8137407Z Test torch.nn.init.kaiming_uniform_(ShardedTensor, a, mode, nonlinearit) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:37:33.8768181Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70780 2022-09-27T16:37:33.8773081Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70781 2022-09-27T16:37:33.8779327Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70782 2022-09-27T16:37:33.8785671Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70783 2022-09-27T16:37:35.4828724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:35.4829684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:35.4831217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:35.4832163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:35.4833329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:35.4834245Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:35.4835445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:35.4836357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:35.5049391Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:35.5050316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:35.5052269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:35.5053255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:35.5120403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:35.5121330Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:35.5122953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:35.5123948Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:35.7130277Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:35.7263167Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:35.7365052Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 
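[Annotation] A minimal sketch of what these TestShardedTensorNNInit cases exercise, assuming four visible GPUs and an already-initialized process group (this helper is illustrative, not the test harness itself): the stock torch.nn.init in-place initializers are applied directly to a ShardedTensor, and each rank only touches its local shard.

    import torch
    from torch.distributed._shard import sharded_tensor
    from torch.distributed._shard.sharding_spec import ChunkShardingSpec

    def init_sharded_weight(world_size: int) -> None:
        # Assumption: torch.distributed is already initialized with one
        # GPU per rank, as the spawned test processes above arrange.
        spec = ChunkShardingSpec(
            dim=0,
            placements=[f"rank:{r}/cuda:{r}" for r in range(world_size)],
        )
        st = sharded_tensor.empty(spec, 12, 5)  # each rank allocates one chunk
        # The in-place initializers named in the test docstrings operate on
        # the local shard held by each rank:
        torch.nn.init.uniform_(st, a=0.0, b=1.0)
        torch.nn.init.kaiming_uniform_(st, a=0.0, mode="fan_in", nonlinearity="leaky_relu")
        torch.nn.init.normal_(st, mean=0.0, std=1.0)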
2022-09-27T16:37:35.7402318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:36.1844627Z skip: Need at least 4 CUDA devices (3.841s) 2022-09-27T16:37:36.1856404Z test_init_sharded_tensor_with_normal (__main__.TestShardedTensorNNInit) 2022-09-27T16:37:36.1871115Z Test torch.nn.init.normal_(ShardedTensor, mean, std) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70916 2022-09-27T16:37:36.1878067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70917 2022-09-27T16:37:36.1884836Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 70918 2022-09-27T16:37:36.1892316Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 70919 2022-09-27T16:37:37.8048799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:37.8049800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:37.8050970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:37.8052144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:37.8053293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:37.8054252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:37.8055419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:37.8056338Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:37.8057504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:37.8058470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:37.8059626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:37.8060568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:37.8258094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:37.8258595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:37.8260674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:37.8261164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:38.0759391Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:38.0766609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:38.0795467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:38.0845811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:38.4956211Z skip: Need at least 4 CUDA devices (2.311s) 2022-09-27T16:37:38.4966755Z test_init_sharded_tensor_with_uniform (__main__.TestShardedTensorNNInit) 2022-09-27T16:37:38.4980426Z Test torch.nn.init.uniform_(ShardedTensor, a, b) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71052 2022-09-27T16:37:38.4986751Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71053 2022-09-27T16:37:38.4993537Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71054 2022-09-27T16:37:38.5000700Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71055 2022-09-27T16:37:40.0946433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:40.0946948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:40.0947527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:40.0948015Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:40.1313590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:40.1314080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:40.1317656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:40.1318147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:40.1357081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:40.1357543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:40.1360584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:40.1361328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:40.1623343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:40.1623869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:40.1624458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:40.1624938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:40.3405567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:40.3614084Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:40.3685201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:40.3798370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:40.8060574Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:40.8060835Z 2022-09-27T16:37:40.8061247Z ---------------------------------------------------------------------- 2022-09-27T16:37:40.8061599Z Ran 3 tests in 8.462s 2022-09-27T16:37:40.8061762Z 2022-09-27T16:37:40.8061851Z OK (skipped=3) 2022-09-27T16:37:40.8062012Z 2022-09-27T16:37:40.8062146Z Generating XML reports... 2022-09-27T16:37:40.8101273Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20220927163732.xml 2022-09-27T16:37:41.1751011Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard ... 
[2022-09-27 16:37:41.174570] 2022-09-27T16:37:41.1751851Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:37:41.174645] 2022-09-27T16:37:43.0592175Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard 2022-09-27T16:37:43.0609122Z 2022-09-27T16:37:43.0609661Z Running tests... 2022-09-27T16:37:43.0610149Z ---------------------------------------------------------------------- 2022-09-27T16:37:44.5882230Z test_sharded_tensor_reshard (__main__.TestReshard) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:37:44.6546174Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71223 2022-09-27T16:37:44.6551889Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71224 2022-09-27T16:37:44.6559290Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71225 2022-09-27T16:37:44.6566362Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71226 2022-09-27T16:37:46.2587693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:46.2588496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:46.2589319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:46.2589802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:46.2896459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:46.2897101Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:46.2898448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:46.2898960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:46.2920237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:46.2920873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:46.2923442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:46.2923911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:46.2966355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:46.2966815Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:46.2969531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:46.2969991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:46.5090821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:46.5202893Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:46.5223634Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:46.5341204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for 
rank 1 2022-09-27T16:37:46.9625243Z skip: Need at least 4 CUDA devices (3.901s) 2022-09-27T16:37:46.9649024Z test_sharded_tensor_reshard_errors (__main__.TestReshard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71359 2022-09-27T16:37:46.9655868Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71360 2022-09-27T16:37:46.9662850Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71361 2022-09-27T16:37:46.9670097Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71362 2022-09-27T16:37:48.5723091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:48.5723618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:48.5724212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:48.5724704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:48.5765149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:48.5765617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:48.5769039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:48.5769519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:48.5854175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:48.5854653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:48.5857654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:48.5858345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:48.6073123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:48.6073608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:48.6076526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:48.6077014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:48.8060851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:48.8184273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:48.8202454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:48.8387955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:49.2726168Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:49.2726431Z 2022-09-27T16:37:49.2726836Z ---------------------------------------------------------------------- 2022-09-27T16:37:49.2727181Z Ran 2 tests in 6.212s 2022-09-27T16:37:49.2727348Z 2022-09-27T16:37:49.2727458Z OK (skipped=2) 2022-09-27T16:37:49.2727604Z 2022-09-27T16:37:49.2727735Z Generating XML reports... 
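[Annotation] Every sharded-tensor suite above skips with "Need at least 4 CUDA devices" because this runner exposes fewer GPUs than the tests shard across. A hedged sketch of the gating pattern (the decorator below is stock unittest; the internal helpers differ):

    import unittest
    import torch

    TEST_GPU_NUM = 4  # these suites place one shard per rank on 4 GPUs

    class ShardedTensorOps(unittest.TestCase):
        @unittest.skipIf(
            torch.cuda.device_count() < TEST_GPU_NUM,
            f"Need at least {TEST_GPU_NUM} CUDA devices",
        )
        def test_sharded_op(self):
            ...  # would spawn one worker process per rank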
2022-09-27T16:37:49.2766339Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard/TEST-TestReshard-20220927163743.xml 2022-09-27T16:37:49.6358696Z Running distributed/_shard/sharded_tensor/ops/test_embedding_bag ... [2022-09-27 16:37:49.635347] 2022-09-27T16:37:49.6359520Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:37:49.635420] 2022-09-27T16:37:51.4631851Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag 2022-09-27T16:37:51.4646895Z 2022-09-27T16:37:51.4647221Z Running tests... 2022-09-27T16:37:51.4647914Z ---------------------------------------------------------------------- 2022-09-27T16:37:52.9347022Z test_sharded_embedding_bag_colwise (__main__.TestShardedEmbeddingBag) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:37:52.9975722Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71530 2022-09-27T16:37:52.9980062Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71531 2022-09-27T16:37:52.9987787Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71532 2022-09-27T16:37:52.9996405Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71533 2022-09-27T16:37:54.5987265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:54.5987981Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:54.5988594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:54.5989073Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:54.6096671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:54.6097141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:54.6099076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:54.6099560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:54.6375280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:54.6375763Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:54.6379108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:54.6379612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:54.6875512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:54.6876007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:54.6877112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:54.6877594Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:54.8412851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:54.8427808Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:54.8575585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:54.9128026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:55.3064571Z skip: Need at least 4 CUDA devices (3.841s) 2022-09-27T16:37:55.3082346Z test_sharded_embedding_bag_rowwise (__main__.TestShardedEmbeddingBag) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71666 2022-09-27T16:37:55.3088438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71667 2022-09-27T16:37:55.3094919Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 71668 2022-09-27T16:37:55.3101583Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 71669 2022-09-27T16:37:56.9205378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:56.9206378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:56.9207518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:56.9208401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:56.9209599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:56.9210576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:56.9211759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:56.9212713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:56.9258802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:56.9259722Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:56.9261710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:56.9262555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:56.9326179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:37:56.9327128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:37:56.9328878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:37:56.9329811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:37:57.1542264Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:37:57.1580966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:37:57.1601434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:37:57.1641042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:37:57.6166725Z skip: Need at least 4 CUDA devices (2.310s) 2022-09-27T16:37:57.6166978Z 2022-09-27T16:37:57.6167394Z ---------------------------------------------------------------------- 
2022-09-27T16:37:57.6167744Z Ran 2 tests in 6.152s 2022-09-27T16:37:57.6167916Z 2022-09-27T16:37:57.6168010Z OK (skipped=2) 2022-09-27T16:37:57.6168169Z 2022-09-27T16:37:57.6168302Z Generating XML reports... 2022-09-27T16:37:57.6206403Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag/TEST-TestShardedEmbeddingBag-20220927163751.xml 2022-09-27T16:37:57.9849493Z Running distributed/fsdp/test_fsdp_multiple_forward ... [2022-09-27 16:37:57.984459] 2022-09-27T16:37:57.9850463Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_forward.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:37:57.984536] 2022-09-27T16:37:59.8560446Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward 2022-09-27T16:37:59.8578416Z 2022-09-27T16:37:59.8578871Z Running tests... 2022-09-27T16:37:59.8579354Z ---------------------------------------------------------------------- 2022-09-27T16:38:01.4057333Z test_multi_forward (__main__.TestMultiForward) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:38:01.4235390Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71837 2022-09-27T16:38:01.4242804Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71838 2022-09-27T16:38:03.0608925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:03.0609882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:03.0611783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:03.0612680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:03.0732763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:03.0733245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:03.0736053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:03.0736545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:03.3215722Z dist init r=0, world=2 2022-09-27T16:38:03.3216160Z dist init r=1, world=2 2022-09-27T16:38:03.3220648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:38:03.3222102Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:38:03.3222661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:38:03.3223336Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:38:04.7239399Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:38:04.7240397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:38:05.1671876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-09-27T16:38:05.1672642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
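[Annotation] The "dist init r=N, world=2" and store_based_barrier_key records above come from the two spawned workers rendezvousing. A minimal sketch of that flow under assumed localhost settings (the MASTER_ADDR/MASTER_PORT values are placeholders):

    import os
    import torch.distributed as dist

    def worker(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        # Each rank registers a key in the rendezvous store and blocks until
        # all world_size peers have done the same -- the "Completed
        # store-based barrier ... with 2 nodes" records mark that completion.
        dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
        dist.destroy_process_group()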
2022-09-27T16:38:05.1712965Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:38:05.1714589Z warnings.warn( 2022-09-27T16:38:05.1717039Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:38:05.1718390Z warnings.warn( 2022-09-27T16:38:05.6329744Z ok (5.775s) 2022-09-27T16:38:05.6330301Z 2022-09-27T16:38:05.6330717Z ---------------------------------------------------------------------- 2022-09-27T16:38:05.6331062Z Ran 1 test in 5.775s 2022-09-27T16:38:05.6331276Z 2022-09-27T16:38:05.6331371Z OK 2022-09-27T16:38:05.6331524Z 2022-09-27T16:38:05.6331640Z Generating XML reports... 2022-09-27T16:38:05.6367435Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward/TEST-TestMultiForward-20220927163759.xml 2022-09-27T16:38:06.0060450Z Running distributed/fsdp/test_fsdp_pure_fp16 ... [2022-09-27 16:38:06.005567] 2022-09-27T16:38:06.0061207Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_pure_fp16.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:06.005648] 2022-09-27T16:38:07.8474276Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16 2022-09-27T16:38:07.8489718Z 2022-09-27T16:38:07.8489870Z Running tests... 2022-09-27T16:38:07.8490519Z ---------------------------------------------------------------------- 2022-09-27T16:38:07.8495792Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=False) (__main__.TestPureFP16) 2022-09-27T16:38:09.3571567Z Tests pure FP16 training, including when the parameter's dtype is ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:38:09.3727061Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/73315 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.523s) 2022-09-27T16:38:09.3731648Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=True) (__main__.TestPureFP16) 2022-09-27T16:38:09.3759794Z Tests pure FP16 training, including when the parameter's dtype is ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71957 2022-09-27T16:38:09.3766588Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71958 2022-09-27T16:38:10.9679081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:10.9679723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:10.9680709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:10.9681195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:11.0088393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:11.0088868Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:11.0091596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:11.0092078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:11.2190720Z dist init r=0, world=2 2022-09-27T16:38:11.2194936Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:38:11.2423378Z dist init r=1, world=2 2022-09-27T16:38:11.2428816Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:38:11.2429611Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:38:11.2500306Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:38:12.6141968Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:38:12.6142491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:38:13.0420812Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:38:13.0422149Z warnings.warn( 2022-09-27T16:38:13.0423522Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1414: UserWarning: Module is put on CPU and will thus have flattening and sharding run on CPU, which is less efficient than on GPU. We recommend passing in `device_id` argument which will enable FSDP to put module on GPU device, module must also be on GPU device to work with `sync_module_states=True` flag which requires GPU communication. 2022-09-27T16:38:13.0424273Z warnings.warn( 2022-09-27T16:38:13.4850127Z ok (4.112s) 2022-09-27T16:38:13.4850429Z 2022-09-27T16:38:13.4850997Z ---------------------------------------------------------------------- 2022-09-27T16:38:13.4851349Z Ran 2 tests in 5.636s 2022-09-27T16:38:13.4851493Z 2022-09-27T16:38:13.4851608Z OK (skipped=1) 2022-09-27T16:38:13.4851759Z 2022-09-27T16:38:13.4851893Z Generating XML reports... 
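[Annotation] The repeated fully_sharded_data_parallel.py:1414 UserWarning above is advisory: the module being wrapped still lives on CPU. A sketch of the remedy the warning itself suggests, assuming a default process group is already initialized on each rank:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(8, 8)  # constructed on CPU, like the test modules above
    # device_id tells FSDP to move the module and run flattening/sharding on
    # that GPU, which also satisfies the sync_module_states=True requirement.
    fsdp_model = FSDP(model, device_id=torch.cuda.current_device())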
2022-09-27T16:38:13.4889948Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20220927163807.xml 2022-09-27T16:38:13.8611180Z Running distributed/elastic/timer/local_timer_test ... [2022-09-27 16:38:13.860621] 2022-09-27T16:38:13.8611933Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/timer/local_timer_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:13.860694] 2022-09-27T16:38:15.7211695Z Test results will be stored in test-reports/python-unittest/distributed.elastic.timer.local_timer_test 2022-09-27T16:38:15.7230113Z 2022-09-27T16:38:15.7230417Z Running tests... 2022-09-27T16:38:15.7230856Z ---------------------------------------------------------------------- 2022-09-27T16:38:15.7237494Z test_acquire_release (__main__.LocalTimerServerTest) 2022-09-27T16:38:17.2692160Z tests that: ... ok (1.546s) 2022-09-27T16:38:17.2699678Z test_expired_timers (__main__.LocalTimerServerTest) 2022-09-27T16:38:17.2717964Z tests that a single expired timer on a process should terminate ... ok (0.003s) 2022-09-27T16:38:17.2729867Z test_valid_timers (__main__.LocalTimerServerTest) 2022-09-27T16:38:17.2748319Z tests that valid timers are processed correctly and the process is left alone ... ok (0.003s) 2022-09-27T16:38:17.2756132Z test_watchdog_call_count (__main__.LocalTimerServerTest) 2022-09-27T16:38:17.3787219Z checks that the watchdog function ran wait/interval +- 1 times ... ok (0.104s) 2022-09-27T16:38:17.3790077Z test_watchdog_empty_queue (__main__.LocalTimerServerTest) 2022-09-27T16:38:17.3899774Z checks that the watchdog can run on an empty queue ... ok (0.011s) 2022-09-27T16:38:17.3942646Z test_client_interaction (__main__.LocalTimerTest) ... ok (0.004s) 2022-09-27T16:38:17.4057064Z test_exception_propagation (__main__.LocalTimerTest) ... ok (0.011s) 2022-09-27T16:38:17.4065678Z test_get_timer_recursive (__main__.LocalTimerTest) 2022-09-27T16:38:19.3272554Z If a function acquires a countdown timer with default scope, ... /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:19.3273134Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:19.3274757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:19.3275246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:19.7558593Z ok (2.350s) 2022-09-27T16:38:19.8585751Z test_happy_path (__main__.LocalTimerTest) ... ok (0.103s) 2022-09-27T16:38:19.8699641Z test_no_client (__main__.LocalTimerTest) ... ok (0.011s) 2022-09-27T16:38:20.0282050Z test_timer (__main__.LocalTimerTest) ... ok (0.158s) 2022-09-27T16:38:20.0517138Z test_get (__main__.MultiprocessingRequestQueueTest) ... ok (0.023s) 2022-09-27T16:38:20.0526340Z test_get_less_than_size (__main__.MultiprocessingRequestQueueTest) 2022-09-27T16:38:20.5694216Z Tests slow producer. ... ok (0.517s) 2022-09-27T16:38:20.5712819Z test_get_size (__main__.MultiprocessingRequestQueueTest) 2022-09-27T16:38:21.4917041Z Creates a "producer" process that enqueues ``n`` elements ... 
ok (0.922s) 2022-09-27T16:38:21.4921392Z 2022-09-27T16:38:21.4922407Z ---------------------------------------------------------------------- 2022-09-27T16:38:21.4922793Z Ran 14 tests in 5.769s 2022-09-27T16:38:21.4922965Z 2022-09-27T16:38:21.4923067Z OK 2022-09-27T16:38:21.4923202Z 2022-09-27T16:38:21.4923332Z Generating XML reports... 2022-09-27T16:38:21.4982957Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerServerTest-20220927163815.xml 2022-09-27T16:38:21.4992208Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerTest-20220927163815.xml 2022-09-27T16:38:21.4998603Z Generated XML report: test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-MultiprocessingRequestQueueTest-20220927163815.xml 2022-09-27T16:38:22.0125865Z Running distributed/fsdp/test_fsdp_traversal ... [2022-09-27 16:38:22.012118] 2022-09-27T16:38:22.0126577Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_traversal.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:22.012195] 2022-09-27T16:38:23.8540019Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_traversal 2022-09-27T16:38:23.8557471Z 2022-09-27T16:38:23.8558079Z Running tests... 2022-09-27T16:38:23.8558953Z ---------------------------------------------------------------------- 2022-09-27T16:38:25.4091872Z test_fsdp_modules (__main__.TestTraversal) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:38:25.4270508Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72167 2022-09-27T16:38:25.4277192Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72168 2022-09-27T16:38:27.0497670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:27.0498227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:27.0499126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:27.0499608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:27.0701660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:27.0702126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:27.0705703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:27.0706195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:27.2918872Z dist init r=1, world=2 2022-09-27T16:38:27.2922887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-09-27T16:38:27.3013683Z dist init r=0, world=2 2022-09-27T16:38:27.3018542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-09-27T16:38:27.3019736Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-09-27T16:38:27.3025337Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
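[Annotation] The local_timer_test suite that finished above exercises torch.distributed.elastic.timer. A condensed sketch of the documented client/server usage it covers (the interval and timeout values here are arbitrary):

    import multiprocessing as mp
    import torch.distributed.elastic.timer as timer

    mp_queue = mp.Queue()
    # Server side: a watchdog thread polls the queue and reaps workers whose
    # timers expire ("test_watchdog_call_count" above checks that polling).
    server = timer.LocalTimerServer(mp_queue, max_interval=0.25)
    server.start()

    # Client side (inside a worker): the wrapped block must finish in time.
    timer.configure(timer.LocalTimerClient(mp_queue))
    with timer.expires(after=60):
        pass  # work that must complete within 60 seconds
    server.stop()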
2022-09-27T16:38:28.7079041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:38:28.7079563Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:38:29.1369880Z ok (5.281s) 2022-09-27T16:38:29.1370105Z 2022-09-27T16:38:29.1370485Z ---------------------------------------------------------------------- 2022-09-27T16:38:29.1370845Z Ran 1 test in 5.281s 2022-09-27T16:38:29.1371014Z 2022-09-27T16:38:29.1372387Z OK 2022-09-27T16:38:29.1372585Z 2022-09-27T16:38:29.1372981Z Generating XML reports... 2022-09-27T16:38:29.1407831Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_traversal/TEST-TestTraversal-20220927163823.xml 2022-09-27T16:38:29.5307645Z Running distributed/elastic/utils/distributed_test ... [2022-09-27 16:38:29.530282] 2022-09-27T16:38:29.5308412Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/utils/distributed_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:29.530356] 2022-09-27T16:38:31.4721320Z Test results will be stored in test-reports/python-unittest/distributed.elastic.utils.distributed_test 2022-09-27T16:38:31.4737203Z 2022-09-27T16:38:31.4737467Z Running tests... 2022-09-27T16:38:31.4737903Z ---------------------------------------------------------------------- 2022-09-27T16:38:33.0362225Z test_create_store_multi (__main__.DistributedUtilTest) ... ok (1.562s) 2022-09-27T16:38:33.0375055Z test_create_store_no_port_multi (__main__.DistributedUtilTest) ... ok (0.001s) 2022-09-27T16:38:33.0381534Z test_create_store_single_server (__main__.DistributedUtilTest) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/66207 for all platform(s). If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.000s) 2022-09-27T16:38:36.0478079Z test_create_store_timeout_on_server (__main__.DistributedUtilTest) ... ok (3.009s) 2022-09-27T16:38:36.0489280Z test_create_store_timeout_on_worker (__main__.DistributedUtilTest) ... [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (ac37d1fee4fc, 0). 2022-09-27T16:38:36.0489791Z ok (0.001s) 2022-09-27T16:38:36.0507983Z test_port_already_in_use_on_server (__main__.DistributedUtilTest) ... [W socket.cpp:426] [c10d] The server socket has failed to bind to [::]:33435 (errno: 98 - Address already in use). 2022-09-27T16:38:36.0530577Z [W socket.cpp:426] [c10d] The server socket has failed to bind to 0.0.0.0:33435 (errno: 98 - Address already in use). 2022-09-27T16:38:36.0531052Z [E socket.cpp:462] [c10d] The server socket has failed to listen on any local network address. 2022-09-27T16:38:36.0534404Z ok (0.004s) 2022-09-27T16:38:36.0570339Z test_port_already_in_use_on_worker (__main__.DistributedUtilTest) ... [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (ac37d1fee4fc, 52715). 2022-09-27T16:38:36.0571557Z ok (0.004s) 2022-09-27T16:38:36.0574492Z 2022-09-27T16:38:36.0575055Z ---------------------------------------------------------------------- 2022-09-27T16:38:36.0575436Z Ran 7 tests in 4.584s 2022-09-27T16:38:36.0575622Z 2022-09-27T16:38:36.0575745Z OK (skipped=1) 2022-09-27T16:38:36.0575908Z 2022-09-27T16:38:36.0576042Z Generating XML reports...
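[Annotation] The socket errors captured above ("client socket has timed out", "Address already in use") are the expected-failure paths of store creation that DistributedUtilTest probes. A sketch of the happy path with c10d's TCPStore (host and port are placeholders):

    from datetime import timedelta
    import torch.distributed as dist

    # The server rank binds the port; a second bind on the same port yields
    # the "Address already in use" warnings logged above.
    server = dist.TCPStore("127.0.0.1", 29500, world_size=2, is_master=True,
                           timeout=timedelta(seconds=30))
    # A worker connects to the same endpoint; pointing it at a dead endpoint
    # produces the "client socket has timed out" error path instead.
    worker = dist.TCPStore("127.0.0.1", 29500, world_size=2, is_master=False,
                           timeout=timedelta(seconds=30))
    worker.set("key", "value")
    print(server.get("key"))  # b'value'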
2022-09-27T16:38:36.1259456Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.distributed_test/TEST-DistributedUtilTest-20220927163831.xml 2022-09-27T16:38:36.5009102Z Running distributed/_shard/sharded_optim/test_sharded_optim ... [2022-09-27 16:38:36.500454] 2022-09-27T16:38:36.5009893Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_optim/test_sharded_optim.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:36.500529] 2022-09-27T16:38:38.3278083Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim 2022-09-27T16:38:38.3294142Z 2022-09-27T16:38:38.3294537Z Running tests... 2022-09-27T16:38:38.3295057Z ---------------------------------------------------------------------- 2022-09-27T16:38:39.8485802Z test_named_params_with_sharded_tensor (__main__.TestShardedOptimizer) ... INFO:numba.cuda.cudadrv.driver:init 2022-09-27T16:38:39.8642061Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82023 for all platform(s). If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.535s) 2022-09-27T16:38:39.8687397Z test_sharded_optim (__main__.TestShardedOptimizer) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72327 2022-09-27T16:38:39.8694235Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72328 2022-09-27T16:38:39.8701089Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 72329 2022-09-27T16:38:39.8707814Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 72330 2022-09-27T16:38:41.4918577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:41.4919294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:41.4921020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:41.4921505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:41.4999643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:41.5000113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:41.5003875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:41.5004361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:41.5121674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:41.5122157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:41.5125659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:41.5126143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:41.5492987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:123: UserWarning: loaded 45 slow tests 2022-09-27T16:38:41.5493599Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-09-27T16:38:41.5496211Z
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:127: UserWarning: loaded 261 disabled tests 2022-09-27T16:38:41.5496866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-09-27T16:38:41.7301730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-09-27T16:38:41.7329450Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-09-27T16:38:41.7433261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-09-27T16:38:41.7624030Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-09-27T16:38:42.1765653Z skip: Need at least 4 CUDA devices (2.312s) 2022-09-27T16:38:42.1766097Z 2022-09-27T16:38:42.1766524Z ---------------------------------------------------------------------- 2022-09-27T16:38:42.1766873Z Ran 2 tests in 3.847s 2022-09-27T16:38:42.1767026Z 2022-09-27T16:38:42.1767210Z OK (skipped=2) 2022-09-27T16:38:42.1767497Z 2022-09-27T16:38:42.1767655Z Generating XML reports... 2022-09-27T16:38:42.1805986Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20220927163838.xml 2022-09-27T16:38:42.5477763Z Running distributed/fsdp/test_flatten_params_wrapper ... [2022-09-27 16:38:42.547288] 2022-09-27T16:38:42.5478810Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_flatten_params_wrapper.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:42.547363] 2022-09-27T16:38:44.4573712Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper 2022-09-27T16:38:44.4594039Z 2022-09-27T16:38:44.4594293Z Running tests... 2022-09-27T16:38:44.4594729Z ---------------------------------------------------------------------- 2022-09-27T16:38:46.0319231Z test_empty_module (__main__.TestFlattenParams) ... ok (1.572s) 2022-09-27T16:38:46.0417228Z test_flatten_nothing (__main__.TestFlattenParams) ... ok (0.010s) 2022-09-27T16:38:46.0516142Z test_num_params (__main__.TestFlattenParams) ... ok (0.010s) 2022-09-27T16:38:46.0784303Z test_output (__main__.TestFlattenParams) ... ok (0.027s) 2022-09-27T16:38:46.0900356Z test_partial_flattening (__main__.TestFlattenParams) ... ok (0.012s) 2022-09-27T16:38:46.0984388Z test_sharded_flat_param (__main__.TestFlattenParams) ... ok (0.008s) 2022-09-27T16:38:46.1081518Z test_shared_params_num_params (__main__.TestFlattenParams) ... ok (0.010s) 2022-09-27T16:38:46.1296745Z test_shared_params_output (__main__.TestFlattenParams) ... ok (0.021s) 2022-09-27T16:38:46.1705998Z test_shared_params_pnorm_after_step (__main__.TestFlattenParams) ... ok (0.041s) 2022-09-27T16:38:46.1720548Z test_empty_module (__main__.TestFlattenParamsCUDA) ... ok (0.001s) 2022-09-27T16:38:46.1822206Z test_flatten_nothing (__main__.TestFlattenParamsCUDA) ... ok (0.010s) 2022-09-27T16:38:46.1946578Z test_num_params (__main__.TestFlattenParamsCUDA) ... ok (0.012s) 2022-09-27T16:38:46.6228443Z test_output (__main__.TestFlattenParamsCUDA) ... ok (0.428s) 2022-09-27T16:38:46.6375841Z test_partial_flattening (__main__.TestFlattenParamsCUDA) ... ok (0.015s) 2022-09-27T16:38:46.6470094Z test_sharded_flat_param (__main__.TestFlattenParamsCUDA) ... ok (0.009s) 2022-09-27T16:38:46.6592668Z test_shared_params_num_params (__main__.TestFlattenParamsCUDA) ... 
ok (0.012s) 2022-09-27T16:38:46.6829568Z test_shared_params_output (__main__.TestFlattenParamsCUDA) ... ok (0.024s) 2022-09-27T16:38:46.7307428Z test_shared_params_pnorm_after_step (__main__.TestFlattenParamsCUDA) ... ok (0.048s) 2022-09-27T16:38:46.7322699Z test_empty_module (__main__.TestFlattenParamsCUDAHalf) ... ok (0.001s) 2022-09-27T16:38:46.7441435Z test_flatten_nothing (__main__.TestFlattenParamsCUDAHalf) ... ok (0.012s) 2022-09-27T16:38:46.7576793Z test_num_params (__main__.TestFlattenParamsCUDAHalf) ... ok (0.013s) 2022-09-27T16:38:46.7832197Z test_output (__main__.TestFlattenParamsCUDAHalf) ... ok (0.025s) 2022-09-27T16:38:46.7994717Z test_partial_flattening (__main__.TestFlattenParamsCUDAHalf) ... ok (0.016s) 2022-09-27T16:38:46.8076972Z test_sharded_flat_param (__main__.TestFlattenParamsCUDAHalf) ... ok (0.008s) 2022-09-27T16:38:46.8220808Z test_shared_params_num_params (__main__.TestFlattenParamsCUDAHalf) ... ok (0.014s) 2022-09-27T16:38:46.8476259Z test_shared_params_output (__main__.TestFlattenParamsCUDAHalf) ... ok (0.025s) 2022-09-27T16:38:46.9007655Z test_shared_params_pnorm_after_step (__main__.TestFlattenParamsCUDAHalf) ... ok (0.053s) 2022-09-27T16:38:46.9007976Z 2022-09-27T16:38:46.9008365Z ---------------------------------------------------------------------- 2022-09-27T16:38:46.9008704Z Ran 27 tests in 2.441s 2022-09-27T16:38:46.9008871Z 2022-09-27T16:38:46.9008948Z OK 2022-09-27T16:38:46.9009087Z 2022-09-27T16:38:46.9009218Z Generating XML reports... 2022-09-27T16:38:46.9054358Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper/TEST-TestFlattenParams-20220927163844.xml 2022-09-27T16:38:46.9065800Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper/TEST-TestFlattenParamsCUDA-20220927163844.xml 2022-09-27T16:38:46.9077013Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper/TEST-TestFlattenParamsCUDAHalf-20220927163844.xml 2022-09-27T16:38:47.3287846Z Running distributed/elastic/utils/logging_test ... [2022-09-27 16:38:47.328293] 2022-09-27T16:38:47.3288644Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/utils/logging_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:47.328370] 2022-09-27T16:38:49.2472345Z Test results will be stored in test-reports/python-unittest/distributed.elastic.utils.logging_test 2022-09-27T16:38:49.2488512Z 2022-09-27T16:38:49.2489047Z Running tests... 2022-09-27T16:38:49.2489551Z ---------------------------------------------------------------------- 2022-09-27T16:38:50.7880963Z test_derive_module_name (__main__.LoggingTest) ... ok (1.539s) 2022-09-27T16:38:50.7902471Z test_logger_name (__main__.LoggingTest) ... ok (0.002s) 2022-09-27T16:38:50.7903191Z 2022-09-27T16:38:50.7903582Z ---------------------------------------------------------------------- 2022-09-27T16:38:50.7903954Z Ran 2 tests in 1.541s 2022-09-27T16:38:50.7904122Z 2022-09-27T16:38:50.7904202Z OK 2022-09-27T16:38:50.7904341Z 2022-09-27T16:38:50.7904473Z Generating XML reports... 2022-09-27T16:38:50.8405459Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.logging_test/TEST-LoggingTest-20220927163849.xml 2022-09-27T16:38:51.2102035Z Running distributed/test_launcher ... [2022-09-27 16:38:51.209722] 2022-09-27T16:38:51.2102760Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_launcher.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-09-27 16:38:51.209803] 2022-09-27T16:38:53.5442541Z Test results will be stored in test-reports/python-unittest/distributed.test_launcher 2022-09-27T16:38:53.5458158Z 2022-09-27T16:38:53.5458434Z Running tests... 2022-09-27T16:38:53.5458869Z ---------------------------------------------------------------------- 2022-09-27T16:38:55.1135846Z test_launch_user_script (__main__.TestDistributedLaunch) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/79488 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.567s) 2022-09-27T16:38:55.1136560Z 2022-09-27T16:38:55.1136828Z ---------------------------------------------------------------------- 2022-09-27T16:38:55.1137164Z Ran 1 test in 1.568s 2022-09-27T16:38:55.1137333Z 2022-09-27T16:38:55.1137444Z OK (skipped=1) 2022-09-27T16:38:55.1137605Z 2022-09-27T16:38:55.1137733Z Generating XML reports... 2022-09-27T16:38:55.1169615Z Generated XML report: test-reports/python-unittest/distributed.test_launcher/TEST-TestDistributedLaunch-20220927163853.xml 2022-09-27T16:38:55.4857585Z Running distributed/_shard/checkpoint/test_utils ... [2022-09-27 16:38:55.485304] 2022-09-27T16:38:55.4858342Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/checkpoint/test_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:55.485377] 2022-09-27T16:38:57.3411720Z Test results will be stored in test-reports/python-unittest/distributed._shard.checkpoint.test_utils 2022-09-27T16:38:57.3427430Z 2022-09-27T16:38:57.3427573Z Running tests... 2022-09-27T16:38:57.3428028Z ---------------------------------------------------------------------- 2022-09-27T16:38:58.9214602Z test_flat_data (__main__.TestMedatadaIndex) ... ok (1.578s) 2022-09-27T16:38:58.9223293Z test_index_hint_ignored_on_equals (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-09-27T16:38:58.9232820Z test_index_hint_ignored_on_hash (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-09-27T16:38:58.9242818Z test_init_convert_offset (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-09-27T16:38:58.9277949Z test_sharded_tensor_lookup (__main__.TestMedatadaIndex) ... ok (0.003s) 2022-09-27T16:38:58.9278490Z 2022-09-27T16:38:58.9279213Z ---------------------------------------------------------------------- 2022-09-27T16:38:58.9280292Z Ran 5 tests in 1.585s 2022-09-27T16:38:58.9280641Z 2022-09-27T16:38:58.9280817Z OK 2022-09-27T16:38:58.9281088Z 2022-09-27T16:38:58.9281329Z Generating XML reports... 2022-09-27T16:38:58.9320487Z Generated XML report: test-reports/python-unittest/distributed._shard.checkpoint.test_utils/TEST-TestMedatadaIndex-20220927163857.xml 2022-09-27T16:38:59.3081449Z Running distributed/test_nccl ... [2022-09-27 16:38:59.307565] 2022-09-27T16:38:59.3082844Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_nccl.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:38:59.307645] 2022-09-27T16:39:02.6265299Z Test results will be stored in test-reports/python-unittest/distributed.test_nccl 2022-09-27T16:39:02.6288637Z 2022-09-27T16:39:02.6288899Z Running tests... 2022-09-27T16:39:02.6289351Z ---------------------------------------------------------------------- 2022-09-27T16:39:03.7806395Z test_all_gather_cuda_bfloat16 (__main__.TestNCCLCUDA) ... 
ok (1.151s) 2022-09-27T16:39:03.7840613Z test_all_gather_cuda_float32 (__main__.TestNCCLCUDA) ... ok (0.003s) 2022-09-27T16:39:03.7879296Z test_all_reduce_cuda_bfloat16 (__main__.TestNCCLCUDA) ... ok (0.004s) 2022-09-27T16:39:03.7917977Z test_all_reduce_cuda_float32 (__main__.TestNCCLCUDA) ... ok (0.004s) 2022-09-27T16:39:03.7950906Z test_broadcast_cuda_bfloat16 (__main__.TestNCCLCUDA) ... ok (0.003s) 2022-09-27T16:39:03.7982941Z test_broadcast_cuda_float32 (__main__.TestNCCLCUDA) ... ok (0.003s) 2022-09-27T16:39:03.8001744Z test_collective_errors_cuda (__main__.TestNCCLCUDA) ... ok (0.002s) 2022-09-27T16:39:03.8027978Z test_reduce_cuda_bfloat16 (__main__.TestNCCLCUDA) ... ok (0.003s) 2022-09-27T16:39:03.8054013Z test_reduce_cuda_float32 (__main__.TestNCCLCUDA) ... ok (0.002s) 2022-09-27T16:39:03.8091217Z test_reduce_scatter_cuda_bfloat16 (__main__.TestNCCLCUDA) ... ok (0.004s) 2022-09-27T16:39:03.8127267Z test_reduce_scatter_cuda_float32 (__main__.TestNCCLCUDA) ... ok (0.004s) 2022-09-27T16:39:03.8137662Z test_unique_id_cuda (__main__.TestNCCLCUDA) ... ok (0.001s) 2022-09-27T16:39:03.8138014Z 2022-09-27T16:39:03.8138474Z ---------------------------------------------------------------------- 2022-09-27T16:39:03.8138858Z Ran 12 tests in 1.185s 2022-09-27T16:39:03.8139037Z 2022-09-27T16:39:03.8139132Z OK 2022-09-27T16:39:03.8139310Z 2022-09-27T16:39:03.8139445Z Generating XML reports... 2022-09-27T16:39:03.8183426Z Generated XML report: test-reports/python-unittest/distributed.test_nccl/TEST-TestNCCLCUDA-20220927163902.xml 2022-09-27T16:39:04.2521448Z Running distributed/_shard/sharded_tensor/ops/test_math_ops ... [2022-09-27 16:39:04.251665] 2022-09-27T16:39:04.2522247Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_math_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-09-27 16:39:04.251740] 2022-09-27T16:39:06.3353658Z Running distributed/elastic/events/lib_test ... [2022-09-27 16:39:06.334871] 2022-09-27T16:39:06.3354340Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/elastic/events/lib_test.py', '-v'] ... [2022-09-27 16:39:06.334948] 2022-09-27T16:39:07.2883356Z ============================= test session starts ============================== 2022-09-27T16:39:07.2883943Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:07.3009242Z cachedir: .pytest_cache 2022-09-27T16:39:07.3010117Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:07.3010637Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:07.3011178Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:08.9709461Z collecting ...  
2022-09-27T16:39:08.9720560Z collecting 3 items  2022-09-27T16:39:08.9721016Z collected 8 items  2022-09-27T16:39:08.9725465Z 2022-09-27T16:39:08.9751626Z distributed/elastic/events/lib_test.py::EventLibTest::test_event_created PASSED [ 12%] 2022-09-27T16:39:08.9765575Z distributed/elastic/events/lib_test.py::EventLibTest::test_event_deser PASSED [ 25%] 2022-09-27T16:39:08.9783505Z distributed/elastic/events/lib_test.py::EventLibTest::test_get_or_create_logger PASSED [ 37%] 2022-09-27T16:39:09.0699185Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event PASSED [ 50%] 2022-09-27T16:39:09.0718000Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event_does_not_run_if_invalid_dest PASSED [ 62%] 2022-09-27T16:39:09.0730187Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_created PASSED [ 75%] 2022-09-27T16:39:09.0743897Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_deserialize PASSED [ 87%] 2022-09-27T16:39:09.0765907Z distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_str PASSED [100%] 2022-09-27T16:39:09.0766900Z 2022-09-27T16:39:09.0767240Z ============================== 8 passed in 1.79s =============================== 2022-09-27T16:39:09.3528580Z Running distributed/pipeline/sync/skip/test_api ... [2022-09-27 16:39:09.352410] 2022-09-27T16:39:09.3529260Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_api.py', '-v'] ... [2022-09-27 16:39:09.352486] 2022-09-27T16:39:11.4084628Z ============================= test session starts ============================== 2022-09-27T16:39:11.4085184Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:11.4180008Z cachedir: .pytest_cache 2022-09-27T16:39:11.4180910Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:11.4181385Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:11.4181711Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:11.4182240Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:11.4345960Z collecting ...  2022-09-27T16:39:11.4346350Z collected 3 items  2022-09-27T16:39:11.4351708Z 2022-09-27T16:39:11.4383169Z distributed/pipeline/sync/skip/test_api.py::test_namespace_difference PASSED [ 33%] 2022-09-27T16:39:11.4399554Z distributed/pipeline/sync/skip/test_api.py::test_namespace_copy PASSED [ 66%] 2022-09-27T16:39:11.4633261Z distributed/pipeline/sync/skip/test_api.py::test_skippable_repr PASSED [100%] 2022-09-27T16:39:11.4634799Z 2022-09-27T16:39:11.4635299Z ============================== 3 passed in 0.06s =============================== 2022-09-27T16:39:11.7281102Z Running distributed/pipeline/sync/skip/test_leak ... [2022-09-27 16:39:11.727662] 2022-09-27T16:39:11.7282051Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_leak.py', '-v'] ... 
[2022-09-27 16:39:11.727737] 2022-09-27T16:39:13.7348943Z ============================= test session starts ============================== 2022-09-27T16:39:13.7349497Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:13.7445024Z cachedir: .pytest_cache 2022-09-27T16:39:13.7445891Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:13.7446338Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:13.7446664Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:13.7447186Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:13.7628538Z collecting ...  2022-09-27T16:39:13.7628957Z collected 8 items  2022-09-27T16:39:13.7633537Z 2022-09-27T16:39:13.8684134Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train] PASSED [ 12%] 2022-09-27T16:39:13.8984633Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval] PASSED [ 25%] 2022-09-27T16:39:13.9294297Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train] PASSED [ 37%] 2022-09-27T16:39:13.9579212Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval] PASSED [ 50%] 2022-09-27T16:39:13.9873936Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train] PASSED [ 62%] 2022-09-27T16:39:14.0158130Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval] PASSED [ 75%] 2022-09-27T16:39:14.0415697Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train] PASSED [ 87%] 2022-09-27T16:39:14.0674765Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] PASSED [100%] 2022-09-27T16:39:14.0675506Z 2022-09-27T16:39:14.0675844Z ============================== 8 passed in 0.33s =============================== 2022-09-27T16:39:14.3251072Z Running distributed/pipeline/sync/skip/test_tracker ... [2022-09-27 16:39:14.324635] 2022-09-27T16:39:14.3251755Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_tracker.py', '-v'] ... [2022-09-27 16:39:14.324711] 2022-09-27T16:39:16.4253007Z ============================= test session starts ============================== 2022-09-27T16:39:16.4253579Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:16.4350519Z cachedir: .pytest_cache 2022-09-27T16:39:16.4351570Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:16.4352025Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:16.4352333Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:16.4352876Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:16.4836884Z collecting ...  
2022-09-27T16:39:16.4837590Z collected 6 items  2022-09-27T16:39:16.4841685Z 2022-09-27T16:39:16.4876944Z distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker PASSED [ 16%] 2022-09-27T16:39:17.7446037Z distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker_by_data_parallel PASSED [ 33%] 2022-09-27T16:39:17.7460314Z distributed/pipeline/sync/skip/test_tracker.py::test_reuse_portal PASSED [ 50%] 2022-09-27T16:39:17.7473820Z distributed/pipeline/sync/skip/test_tracker.py::test_no_copy_no_portal PASSED [ 66%] 2022-09-27T16:39:17.7486757Z distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_without_checkpointing PASSED [ 83%] 2022-09-27T16:39:17.7503750Z distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_with_checkpointing PASSED [100%] 2022-09-27T16:39:17.7505274Z 2022-09-27T16:39:17.7505589Z ============================== 6 passed in 1.33s =============================== 2022-09-27T16:39:18.0701193Z Running distributed/pipeline/sync/test_bugs ... [2022-09-27 16:39:18.069623] 2022-09-27T16:39:18.0701851Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_bugs.py', '-v'] ... [2022-09-27 16:39:18.069698] 2022-09-27T16:39:20.1473428Z ============================= test session starts ============================== 2022-09-27T16:39:20.1474046Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:20.1568952Z cachedir: .pytest_cache 2022-09-27T16:39:20.1569699Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:20.1570125Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:20.1570449Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:20.1570978Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:20.1952011Z collecting ...  2022-09-27T16:39:20.1952605Z collected 4 items  2022-09-27T16:39:20.1956704Z 2022-09-27T16:39:20.2854664Z distributed/pipeline/sync/test_bugs.py::test_python_autograd_function PASSED [ 25%] 2022-09-27T16:39:20.3041361Z distributed/pipeline/sync/test_bugs.py::test_exception_no_hang PASSED [ 50%] 2022-09-27T16:39:23.7084843Z distributed/pipeline/sync/test_bugs.py::test_tuple_wait PASSED [ 75%] 2022-09-27T16:39:23.8577500Z distributed/pipeline/sync/test_bugs.py::test_parallel_randoms PASSED [100%] 2022-09-27T16:39:23.8577847Z 2022-09-27T16:39:23.8578153Z ============================== 4 passed in 3.71s =============================== 2022-09-27T16:39:24.2373315Z Running distributed/pipeline/sync/test_deferred_batch_norm ... [2022-09-27 16:39:24.236829] 2022-09-27T16:39:24.2374005Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_deferred_batch_norm.py', '-v'] ... 
[2022-09-27 16:39:24.236911] 2022-09-27T16:39:26.3279349Z ============================= test session starts ============================== 2022-09-27T16:39:26.3279932Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:26.3374745Z cachedir: .pytest_cache 2022-09-27T16:39:26.3375463Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:26.3375907Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:26.3376235Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:26.3376759Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:26.3690328Z collecting ...  2022-09-27T16:39:26.3690741Z collected 11 items  2022-09-27T16:39:26.3694473Z 2022-09-27T16:39:26.4536951Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-1] PASSED [ 9%] 2022-09-27T16:39:26.4963921Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-4] PASSED [ 18%] 2022-09-27T16:39:26.5339689Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-1] PASSED [ 27%] 2022-09-27T16:39:26.5706739Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-4] PASSED [ 36%] 2022-09-27T16:39:26.5975047Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[0.1] PASSED [ 45%] 2022-09-27T16:39:26.6241669Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[None] PASSED [ 54%] 2022-09-27T16:39:26.6262321Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_convert_deferred_batch_norm PASSED [ 63%] 2022-09-27T16:39:26.6583892Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_eval PASSED [ 72%] 2022-09-27T16:39:26.8045013Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_optimize PASSED [ 81%] 2022-09-27T16:39:26.8905496Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_conv_bn PASSED [ 90%] 2022-09-27T16:39:26.9152326Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_input_requiring_grad PASSED [100%] 2022-09-27T16:39:26.9152914Z 2022-09-27T16:39:26.9153245Z ============================== 11 passed in 0.59s ============================== 2022-09-27T16:39:27.1797598Z Running distributed/pipeline/sync/test_microbatch ... [2022-09-27 16:39:27.179310] 2022-09-27T16:39:27.1798313Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_microbatch.py', '-v'] ... [2022-09-27 16:39:27.179390] 2022-09-27T16:39:29.2587563Z ============================= test session starts ============================== 2022-09-27T16:39:29.2588124Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:29.2682498Z cachedir: .pytest_cache 2022-09-27T16:39:29.2683421Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:29.2683856Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:29.2684173Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:29.2684705Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:29.3013070Z collecting ...  
2022-09-27T16:39:29.3013737Z collected 10 items  2022-09-27T16:39:29.3017973Z 2022-09-27T16:39:29.3049722Z distributed/pipeline/sync/test_microbatch.py::test_batch_atomic PASSED [ 10%] 2022-09-27T16:39:29.3066989Z distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic PASSED [ 20%] 2022-09-27T16:39:29.3084132Z distributed/pipeline/sync/test_microbatch.py::test_batch_call PASSED [ 30%] 2022-09-27T16:39:29.3101813Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index PASSED [ 40%] 2022-09-27T16:39:29.3119336Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice PASSED [ 50%] 2022-09-27T16:39:29.3139698Z distributed/pipeline/sync/test_microbatch.py::test_check PASSED [ 60%] 2022-09-27T16:39:29.3334229Z distributed/pipeline/sync/test_microbatch.py::test_gather_tensors PASSED [ 70%] 2022-09-27T16:39:29.3351018Z distributed/pipeline/sync/test_microbatch.py::test_gather_tuples PASSED [ 80%] 2022-09-27T16:39:29.3368732Z distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor PASSED [ 90%] 2022-09-27T16:39:29.3388738Z distributed/pipeline/sync/test_microbatch.py::test_scatter_multiple_tensors PASSED [100%] 2022-09-27T16:39:29.3390287Z 2022-09-27T16:39:29.3391140Z ============================== 10 passed in 0.08s ============================== 2022-09-27T16:39:29.5895416Z Running distributed/pipeline/sync/test_pipeline ... [2022-09-27 16:39:29.589067] 2022-09-27T16:39:29.5896403Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_pipeline.py', '-v'] ... [2022-09-27 16:39:29.589144] 2022-09-27T16:39:31.6463797Z ============================= test session starts ============================== 2022-09-27T16:39:31.6464404Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:31.6560161Z cachedir: .pytest_cache 2022-09-27T16:39:31.6560779Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:31.6561201Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:31.6561535Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:31.6562065Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:31.6727054Z collecting ...  2022-09-27T16:39:31.6727487Z collected 1 item  2022-09-27T16:39:31.6731906Z 2022-09-27T16:39:31.6767943Z distributed/pipeline/sync/test_pipeline.py::test_clock_cycles PASSED [100%] 2022-09-27T16:39:31.6769064Z 2022-09-27T16:39:31.6769378Z ============================== 1 passed in 0.03s =============================== 2022-09-27T16:39:31.9400783Z Running distributed/pipeline/sync/test_worker ... [2022-09-27 16:39:31.939584] 2022-09-27T16:39:31.9401423Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_worker.py', '-v'] ... 
[2022-09-27 16:39:31.939659] 2022-09-27T16:39:34.0449175Z ============================= test session starts ============================== 2022-09-27T16:39:34.0449739Z platform linux -- Python 3.10.4, pytest-7.1.3, pluggy-1.0.0 -- /opt/conda/bin/python 2022-09-27T16:39:34.0548179Z cachedir: .pytest_cache 2022-09-27T16:39:34.0549438Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-09-27T16:39:34.0550037Z torch: 1.13.0a0+git52424e2 2022-09-27T16:39:34.0550352Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-09-27T16:39:34.0551163Z plugins: hypothesis-5.35.1, forked-1.4.0, rerunfailures-10.2, xdist-2.5.0, xdoctest-1.0.2 2022-09-27T16:39:34.0786781Z collecting ...  2022-09-27T16:39:34.0787198Z collected 6 items  2022-09-27T16:39:34.0791777Z 2022-09-27T16:39:34.0829041Z distributed/pipeline/sync/test_worker.py::test_compute_multithreading PASSED [ 16%] 2022-09-27T16:39:34.0851619Z distributed/pipeline/sync/test_worker.py::test_compute_success PASSED [ 33%] 2022-09-27T16:39:34.0872892Z distributed/pipeline/sync/test_worker.py::test_compute_exception PASSED [ 50%] 2022-09-27T16:39:34.1091722Z distributed/pipeline/sync/test_worker.py::test_grad_mode[True] PASSED [ 66%] 2022-09-27T16:39:34.1113039Z distributed/pipeline/sync/test_worker.py::test_grad_mode[False] PASSED [ 83%] 2022-09-27T16:39:34.1139542Z distributed/pipeline/sync/test_worker.py::test_worker_per_device PASSED [100%] 2022-09-27T16:39:34.1141095Z 2022-09-27T16:39:34.1141481Z ============================== 6 passed in 0.07s =============================== 2022-09-27T16:39:34.6343521Z 2022-09-27T16:39:34.6343840Z real 52m43.222s 2022-09-27T16:39:34.6344164Z user 110m5.550s 2022-09-27T16:39:34.6344407Z sys 59m52.413s 2022-09-27T16:39:34.6344636Z + assert_git_not_dirty 2022-09-27T16:39:34.6345142Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *rocm* ]] 2022-09-27T16:39:34.6345560Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *xla* ]] 2022-09-27T16:39:34.6347250Z ++ git status --porcelain 2022-09-27T16:39:35.9918006Z + git_status= 2022-09-27T16:39:35.9918435Z + [[ -n '' ]] 2022-09-27T16:39:35.9918826Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda* ]] 2022-09-27T16:39:35.9919137Z + [[ 2 == 1 ]] 2022-09-27T16:39:35.9919354Z + [[ 2 == 1 ]] 2022-09-27T16:39:35.9983918Z Prepare all required actions 2022-09-27T16:39:35.9984331Z Getting action download info 2022-09-27T16:39:36.2470511Z ##[group]Run ./.github/actions/get-workflow-job-id 2022-09-27T16:39:36.2471220Z with: 2022-09-27T16:39:36.2471996Z github-token: *** 2022-09-27T16:39:36.2472379Z env: 2022-09-27T16:39:36.2472628Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:36.2472898Z GPU_FLAG: --gpus all 2022-09-27T16:39:36.2473128Z ##[endgroup] 2022-09-27T16:39:36.2508164Z ##[group]Run nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767 2022-09-27T16:39:36.2508487Z with: 2022-09-27T16:39:36.2508696Z shell: bash 2022-09-27T16:39:36.2508940Z timeout_minutes: 10 2022-09-27T16:39:36.2509189Z max_attempts: 5 2022-09-27T16:39:36.2509422Z retry_wait_seconds: 30 2022-09-27T16:39:36.2510110Z command: set -eux python3 -m pip install requests==2.26.0 GHA_WORKFLOW_JOB_ID=$(python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}") echo "::set-output name=job-id::${GHA_WORKFLOW_JOB_ID}" 2022-09-27T16:39:36.2510617Z polling_interval_seconds: 1 2022-09-27T16:39:36.2511355Z warning_on_retry: true 2022-09-27T16:39:36.2511854Z continue_on_error: false 2022-09-27T16:39:36.2512297Z env: 
2022-09-27T16:39:36.2512530Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:36.2512799Z GPU_FLAG: --gpus all 2022-09-27T16:39:36.2513196Z GITHUB_TOKEN: *** 2022-09-27T16:39:36.2513448Z ##[endgroup] 2022-09-27T16:39:36.3073686Z 2022-09-27T16:39:36.3137726Z + python3 -m pip install requests==2.26.0 2022-09-27T16:39:36.5980996Z Defaulting to user installation because normal site-packages is not writeable 2022-09-27T16:39:36.7348616Z Collecting requests==2.26.0 2022-09-27T16:39:36.7532674Z Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB) 2022-09-27T16:39:36.8774561Z Collecting urllib3<1.27,>=1.21.1 2022-09-27T16:39:36.8817497Z Downloading urllib3-1.26.12-py2.py3-none-any.whl (140 kB) 2022-09-27T16:39:37.0209602Z Collecting charset-normalizer~=2.0.0; python_version >= "3" 2022-09-27T16:39:37.0252263Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2022-09-27T16:39:37.0693661Z Collecting idna<4,>=2.5; python_version >= "3" 2022-09-27T16:39:37.0747789Z Downloading idna-3.4-py3-none-any.whl (61 kB) 2022-09-27T16:39:37.1381725Z Collecting certifi>=2017.4.17 2022-09-27T16:39:37.1429203Z Downloading certifi-2022.9.24-py3-none-any.whl (161 kB) 2022-09-27T16:39:37.2321971Z Installing collected packages: urllib3, charset-normalizer, idna, certifi, requests 2022-09-27T16:39:37.3417320Z WARNING: The script normalizer is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-09-27T16:39:37.3417987Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-09-27T16:39:37.4846464Z Successfully installed certifi-2022.9.24 charset-normalizer-2.0.12 idna-3.4 requests-2.26.0 urllib3-1.26.12 2022-09-27T16:39:37.5326024Z ++ python3 .github/scripts/get_workflow_job_id.py 3133193930 i-0f5565a17788248fc 2022-09-27T16:39:41.2068870Z + GHA_WORKFLOW_JOB_ID=8576432338 2022-09-27T16:39:41.2069457Z + echo '::set-output name=job-id::8576432338' 2022-09-27T16:39:41.3169908Z Command completed after 1 attempt(s). 
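The step above resolves this run's numeric job id by installing requests and invoking .github/scripts/get_workflow_job_id.py with the run id and runner name, retrying up to 5 times. The script itself is not shown in the log; the sketch below is a minimal, hypothetical version of such a lookup against the GitHub Actions REST API (the endpoint, pagination, and runner_name matching are assumptions, not the script's actual logic):

    # Hypothetical sketch of a job-id lookup like get_workflow_job_id.py.
    # The REST endpoint and matching-by-runner logic are assumptions.
    import os
    import sys
    import requests

    def get_job_id(run_id: str, runner_name: str) -> int:
        url = f"https://api.github.com/repos/pytorch/pytorch/actions/runs/{run_id}/jobs"
        headers = {"Authorization": f"token {os.environ['GITHUB_TOKEN']}"}
        page = 1
        while True:
            resp = requests.get(url, headers=headers,
                                params={"per_page": 100, "page": page})
            resp.raise_for_status()
            jobs = resp.json()["jobs"]
            if not jobs:
                raise RuntimeError(f"no job found on runner {runner_name}")
            for job in jobs:
                # Each job record names the runner it was scheduled on.
                if job.get("runner_name") == runner_name:
                    return job["id"]
            page += 1

    if __name__ == "__main__":
        print(get_job_id(sys.argv[1], sys.argv[2]))

Printing the id lets the calling shell capture it into GHA_WORKFLOW_JOB_ID and expose it as the job-id step output, which the artifact-upload steps below fold into FILE_SUFFIX.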
2022-09-27T16:39:41.3170224Z 2022-09-27T16:39:41.3306043Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2022-09-27T16:39:41.3306371Z kill "$MONITOR_SCRIPT_PID" 2022-09-27T16:39:41.3319940Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:39:41.3320240Z env: 2022-09-27T16:39:41.3320489Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:41.3320739Z GPU_FLAG: --gpus all 2022-09-27T16:39:41.3321003Z MONITOR_SCRIPT_PID: 56742 2022-09-27T16:39:41.3321262Z ##[endgroup] 2022-09-27T16:39:41.3412080Z Prepare all required actions 2022-09-27T16:39:41.3412439Z Getting action download info 2022-09-27T16:39:41.5216538Z Download action repository 'actions/upload-artifact@v2' (SHA:82c141cc518b40d92cc801eee768e7aafc9c2fa2) 2022-09-27T16:39:41.6773399Z ##[group]Run ./.github/actions/upload-test-artifacts 2022-09-27T16:39:41.6773694Z with: 2022-09-27T16:39:41.6774050Z file-suffix: test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338 2022-09-27T16:39:41.6774377Z env: 2022-09-27T16:39:41.6774612Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:41.6774881Z GPU_FLAG: --gpus all 2022-09-27T16:39:41.6775111Z ##[endgroup] 2022-09-27T16:39:41.6806532Z ##[group]Run # Remove any previous test jsons if they exist 2022-09-27T16:39:41.6806907Z # Remove any previous test jsons if they exist 2022-09-27T16:39:41.6807218Z rm -f test-jsons-*.zip 2022-09-27T16:39:41.6807529Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2022-09-27T16:39:41.6819255Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:39:41.6819548Z env: 2022-09-27T16:39:41.6819897Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:41.6820161Z GPU_FLAG: --gpus all 2022-09-27T16:39:41.6820526Z FILE_SUFFIX: test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338 2022-09-27T16:39:41.6820862Z ##[endgroup] 2022-09-27T16:39:41.6953181Z adding: test/allowlist_for_publicAPI.json (deflated 80%) 2022-09-27T16:39:41.6989231Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2022-09-27T16:39:41.6998034Z adding: test/profiler/profiler_utils_mock_events.json (deflated 87%) 2022-09-27T16:39:41.6999242Z adding: test/.pytorch-slow-tests.json (deflated 75%) 2022-09-27T16:39:41.7006801Z adding: test/.pytorch-disabled-tests.json (deflated 85%) 2022-09-27T16:39:41.7033120Z ##[group]Run # Remove any previous test reports if they exist 2022-09-27T16:39:41.7033626Z # Remove any previous test reports if they exist 2022-09-27T16:39:41.7033962Z rm -f test-reports-*.zip 2022-09-27T16:39:41.7034290Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' 2022-09-27T16:39:41.7046996Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:39:41.7047292Z env: 2022-09-27T16:39:41.7047540Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:41.7047794Z GPU_FLAG: --gpus all 2022-09-27T16:39:41.7048165Z FILE_SUFFIX: test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338 2022-09-27T16:39:41.7048519Z ##[endgroup] 2022-09-27T16:39:41.7149346Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20220927154655.xml (deflated 93%) 2022-09-27T16:39:41.7150192Z adding: test/test-reports/python-unittest/distributed.fsdp.test_checkpoint_wrapper/TEST-CheckpointWrapperTest-20220927155037.xml (deflated 68%) 2022-09-27T16:39:41.7151722Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDdpComparisonTest-20220927155044.xml (deflated 41%) 2022-09-27T16:39:41.7152670Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220927155051.xml (deflated 41%) 2022-09-27T16:39:41.7153567Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220927155100.xml (deflated 41%) 2022-09-27T16:39:41.7154467Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaDistAutogradTest-20220927155109.xml (deflated 41%) 2022-09-27T16:39:41.7155358Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155118.xml (deflated 41%) 2022-09-27T16:39:41.7156251Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155127.xml (deflated 40%) 2022-09-27T16:39:41.7157266Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155135.xml (deflated 40%) 2022-09-27T16:39:41.7158178Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRemoteModuleTest-20220927155142.xml (deflated 41%) 2022-09-27T16:39:41.7159038Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeCudaRpcTest-20220927155151.xml (deflated 40%) 2022-09-27T16:39:41.7159883Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155202.xml (deflated 40%) 2022-09-27T16:39:41.7160739Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155209.xml (deflated 40%) 2022-09-27T16:39:41.7161600Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155216.xml (deflated 40%) 2022-09-27T16:39:41.7162457Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155222.xml (deflated 40%) 2022-09-27T16:39:41.7163422Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155229.xml (deflated 40%) 2022-09-27T16:39:41.7164275Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155236.xml (deflated 40%) 2022-09-27T16:39:41.7165110Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155243.xml (deflated 40%) 2022-09-27T16:39:41.7165996Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipePipeWithDDPTest-20220927155249.xml (deflated 40%) 2022-09-27T16:39:41.7167499Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155256.xml (deflated 42%) 2022-09-27T16:39:41.7168483Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155308.xml (deflated 42%) 2022-09-27T16:39:41.7169441Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155322.xml (deflated 42%) 2022-09-27T16:39:41.7170374Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155335.xml (deflated 43%) 2022-09-27T16:39:41.7171534Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155348.xml (deflated 43%) 2022-09-27T16:39:41.7172509Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155359.xml (deflated 43%) 2022-09-27T16:39:41.7173496Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155411.xml (deflated 43%) 2022-09-27T16:39:41.7174445Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155423.xml (deflated 42%) 2022-09-27T16:39:41.7175673Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155435.xml (deflated 43%) 2022-09-27T16:39:41.7176647Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155447.xml (deflated 43%) 2022-09-27T16:39:41.7177587Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155454.xml (deflated 43%) 2022-09-27T16:39:41.7178627Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155501.xml (deflated 43%) 2022-09-27T16:39:41.7179595Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155508.xml (deflated 43%) 2022-09-27T16:39:41.7180508Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155515.xml (deflated 43%) 2022-09-27T16:39:41.7181452Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155523.xml (deflated 43%) 2022-09-27T16:39:41.7182400Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155531.xml (deflated 42%) 2022-09-27T16:39:41.7183346Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155545.xml (deflated 43%) 2022-09-27T16:39:41.7184272Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155549.xml (deflated 42%) 2022-09-27T16:39:41.7185300Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155611.xml (deflated 43%) 2022-09-27T16:39:41.7186239Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155627.xml (deflated 42%) 2022-09-27T16:39:41.7187182Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155641.xml (deflated 42%) 2022-09-27T16:39:41.7188119Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155649.xml (deflated 42%) 2022-09-27T16:39:41.7189054Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155659.xml (deflated 42%) 2022-09-27T16:39:41.7190045Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155709.xml (deflated 42%) 2022-09-27T16:39:41.7191355Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155719.xml (deflated 43%) 2022-09-27T16:39:41.7192314Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155723.xml (deflated 42%) 2022-09-27T16:39:41.7193247Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155736.xml (deflated 42%) 2022-09-27T16:39:41.7194178Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155748.xml (deflated 42%) 2022-09-27T16:39:41.7195127Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155800.xml (deflated 42%) 2022-09-27T16:39:41.7196077Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155813.xml (deflated 42%) 2022-09-27T16:39:41.7197018Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155825.xml (deflated 42%) 2022-09-27T16:39:41.7197942Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155838.xml (deflated 42%) 2022-09-27T16:39:41.7199015Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155850.xml (deflated 42%) 2022-09-27T16:39:41.7199990Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155902.xml (deflated 42%) 2022-09-27T16:39:41.7200933Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155915.xml (deflated 42%) 2022-09-27T16:39:41.7201880Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155927.xml (deflated 42%) 2022-09-27T16:39:41.7202800Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155939.xml (deflated 42%) 2022-09-27T16:39:41.7203743Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927155952.xml (deflated 42%) 2022-09-27T16:39:41.7204696Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160004.xml (deflated 42%) 2022-09-27T16:39:41.7205721Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160017.xml (deflated 42%) 2022-09-27T16:39:41.7206646Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160029.xml (deflated 42%) 2022-09-27T16:39:41.7207575Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160041.xml (deflated 42%) 2022-09-27T16:39:41.7208520Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160052.xml (deflated 43%) 2022-09-27T16:39:41.7209465Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160104.xml (deflated 42%) 2022-09-27T16:39:41.7210407Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160114.xml (deflated 42%) 2022-09-27T16:39:41.7211331Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160125.xml (deflated 42%) 2022-09-27T16:39:41.7212269Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160137.xml (deflated 42%) 2022-09-27T16:39:41.7213210Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160149.xml (deflated 43%) 2022-09-27T16:39:41.7214153Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160157.xml (deflated 43%) 2022-09-27T16:39:41.7215078Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160204.xml (deflated 43%) 2022-09-27T16:39:41.7216025Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160211.xml (deflated 42%) 2022-09-27T16:39:41.7216967Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160218.xml (deflated 42%) 2022-09-27T16:39:41.7217906Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160227.xml (deflated 42%) 2022-09-27T16:39:41.7218844Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160236.xml (deflated 42%) 2022-09-27T16:39:41.7219824Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160245.xml (deflated 42%) 2022-09-27T16:39:41.7220789Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160254.xml (deflated 42%) 2022-09-27T16:39:41.7221730Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160303.xml (deflated 42%) 2022-09-27T16:39:41.7222668Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160312.xml (deflated 42%) 2022-09-27T16:39:41.7223589Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160321.xml (deflated 42%) 2022-09-27T16:39:41.7224531Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160333.xml (deflated 42%) 2022-09-27T16:39:41.7225563Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160346.xml (deflated 42%) 2022-09-27T16:39:41.7226499Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160353.xml (deflated 42%) 2022-09-27T16:39:41.7227474Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160405.xml (deflated 43%) 2022-09-27T16:39:41.7228394Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160412.xml (deflated 43%) 2022-09-27T16:39:41.7229342Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160419.xml (deflated 42%) 2022-09-27T16:39:41.7230286Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160426.xml (deflated 42%) 2022-09-27T16:39:41.7231850Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160436.xml (deflated 42%) 2022-09-27T16:39:41.7232793Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160443.xml (deflated 42%) 2022-09-27T16:39:41.7233724Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160450.xml (deflated 42%) 2022-09-27T16:39:41.7234662Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160503.xml (deflated 42%) 2022-09-27T16:39:41.7235607Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160518.xml (deflated 41%) 2022-09-27T16:39:41.7236549Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160533.xml (deflated 41%) 2022-09-27T16:39:41.7237521Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160546.xml (deflated 42%) 2022-09-27T16:39:41.7238471Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160606.xml (deflated 42%) 2022-09-27T16:39:41.7239411Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160629.xml (deflated 42%) 2022-09-27T16:39:41.7240462Z adding: 
test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160633.xml (deflated 42%) 2022-09-27T16:39:41.7241412Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160655.xml (deflated 42%) 2022-09-27T16:39:41.7242352Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160715.xml (deflated 42%) 2022-09-27T16:39:41.7243299Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160734.xml (deflated 42%) 2022-09-27T16:39:41.7244237Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160754.xml (deflated 42%) 2022-09-27T16:39:41.7245178Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160812.xml (deflated 42%) 2022-09-27T16:39:41.7246096Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160831.xml (deflated 41%) 2022-09-27T16:39:41.7247117Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160852.xml (deflated 42%) 2022-09-27T16:39:41.7248057Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160914.xml (deflated 41%) 2022-09-27T16:39:41.7249000Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160934.xml (deflated 41%) 2022-09-27T16:39:41.7249924Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927160957.xml (deflated 42%) 2022-09-27T16:39:41.7250866Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeAgentCudaRpcTest-20220927161007.xml (deflated 43%) 2022-09-27T16:39:41.7251837Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220927161019.xml (deflated 44%) 2022-09-27T16:39:41.7252821Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220927161026.xml (deflated 44%) 2022-09-27T16:39:41.7253803Z adding: test/test-reports/python-unittest/distributed.rpc.cuda.test_tensorpipe_agent/TEST-TensorPipeTensorPipeCudaDistAutogradTest-20220927161033.xml (deflated 43%) 2022-09-27T16:39:41.7254615Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20220927161040.xml (deflated 79%) 2022-09-27T16:39:41.7255319Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20220927161040.xml (deflated 63%) 2022-09-27T16:39:41.7256040Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20220927161040.xml (deflated 61%) 2022-09-27T16:39:41.7256790Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20220927161040.xml (deflated 91%) 2022-09-27T16:39:41.7257542Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_state_dict/TEST-TestFSDPStateDict-20220927161656.xml (deflated 94%) 2022-09-27T16:39:41.7258375Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestCreateTensorFromParams-20220927162134.xml (deflated 43%) 2022-09-27T16:39:41.7259249Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorMetadata-20220927162134.xml (deflated 43%) 2022-09-27T16:39:41.7260137Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestLocalTensor-20220927162134.xml (deflated 60%) 2022-09-27T16:39:41.7261237Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestModuleHookApi-20220927162134.xml (deflated 57%) 2022-09-27T16:39:41.7262046Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardMetadata-20220927162134.xml (deflated 58%) 2022-09-27T16:39:41.7262856Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardParameter-20220927162134.xml (deflated 60%) 2022-09-27T16:39:41.7263657Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardTensor-20220927162134.xml (deflated 61%) 2022-09-27T16:39:41.7264483Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorChunked-20220927162134.xml (deflated 88%) 2022-09-27T16:39:41.7265344Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorCustomOps-20220927162134.xml (deflated 69%) 2022-09-27T16:39:41.7266304Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorEnumerable-20220927162134.xml (deflated 87%) 2022-09-27T16:39:41.7267210Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalShards-20220927162134.xml (deflated 82%) 2022-09-27T16:39:41.7268129Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor/TEST-TestShardedTensorFromLocalTensor-20220927162134.xml (deflated 61%) 2022-09-27T16:39:41.7268944Z adding: test/test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkSubclass-20220927162401.xml (deflated 84%) 2022-09-27T16:39:41.7269724Z adding: test/test-reports/python-unittest/distributed.test_c10d_pypg/TEST-TestDDPWithWorkWrapper-20220927162401.xml (deflated 84%) 2022-09-27T16:39:41.7270462Z adding: test/test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestAutoWrap-20220927162621.xml (deflated 81%) 2022-09-27T16:39:41.7271470Z adding: test/test-reports/python-unittest/distributed.fsdp.test_wrap/TEST-TestFSDPWrap-20220927162621.xml (deflated 89%) 2022-09-27T16:39:41.7272159Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_misc/TEST-TestFSDPMisc-20220927162759.xml (deflated 77%) 2022-09-27T16:39:41.7272878Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_grad_acc/TEST-TestGradAcc-20220927162905.xml (deflated 93%) 2022-09-27T16:39:41.7273705Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163005.xml (deflated 41%) 2022-09-27T16:39:41.7274550Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163014.xml (deflated 41%) 2022-09-27T16:39:41.7275388Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163023.xml (deflated 42%) 2022-09-27T16:39:41.7276213Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163032.xml (deflated 42%) 2022-09-27T16:39:41.7277050Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163040.xml (deflated 42%) 2022-09-27T16:39:41.7277883Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163049.xml (deflated 42%) 2022-09-27T16:39:41.7278716Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163058.xml (deflated 42%) 2022-09-27T16:39:41.7279618Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163107.xml (deflated 42%) 2022-09-27T16:39:41.7280468Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20220927163115.xml (deflated 42%) 2022-09-27T16:39:41.7281289Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_freezing_weights/TEST-TestFreezingWeights-20220927163123.xml (deflated 84%) 2022-09-27T16:39:41.7282057Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm/TEST-TestCommunication-20220927163209.xml (deflated 91%) 2022-09-27T16:39:41.7282805Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_exec_order/TEST-TestFSDPExecOrder-20220927163248.xml (deflated 83%) 2022-09-27T16:39:41.7283582Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_checkpoint/TEST-TestFSDPCheckpoint-20220927163326.xml (deflated 78%) 2022-09-27T16:39:41.7284374Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_meta/TEST-TestFSDPWithMetaDevice-20220927163400.xml (deflated 86%) 2022-09-27T16:39:41.7285214Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20220927163433.xml (deflated 86%) 2022-09-27T16:39:41.7286129Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_ignored_modules/TEST-TestFSDPIgnoredModules-20220927163503.xml (deflated 75%) 2022-09-27T16:39:41.7287016Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedReshardOnLoad-20220927163528.xml (deflated 68%) 2022-09-27T16:39:41.7287952Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoad-20220927163528.xml (deflated 42%) 2022-09-27T16:39:41.7288984Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_file_system_checkpoint/TEST-TestDistributedStateDictSaveLoadWithSharedTensor-20220927163528.xml (deflated 44%) 2022-09-27T16:39:41.7289881Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_memory/TEST-TestFSDPMemory-20220927163552.xml (deflated 55%) 2022-09-27T16:39:41.7290646Z adding: test/test-reports/python-unittest/distributed._shard.sharding_plan.test_sharding_plan/TEST-TestShardingPlan-20220927163610.xml (deflated 76%) 2022-09-27T16:39:41.7291433Z adding: 
test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20220927163625.xml (deflated 67%) 2022-09-27T16:39:41.7292232Z adding: test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20220927163625.xml (deflated 60%) 2022-09-27T16:39:41.7292987Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_apply/TEST-TestApply-20220927163641.xml (deflated 60%) 2022-09-27T16:39:41.7293780Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_binary_cmp/TEST-TestShardedTensorBinaryOps-20220927163656.xml (deflated 74%) 2022-09-27T16:39:41.7294558Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20220927163709.xml (deflated 57%) 2022-09-27T16:39:41.7295360Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_linear/TEST-TestShardedTensorOpsLinear-20220927163721.xml (deflated 68%) 2022-09-27T16:39:41.7296212Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_init/TEST-TestShardedTensorNNInit-20220927163732.xml (deflated 69%) 2022-09-27T16:39:41.7297010Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.test_sharded_tensor_reshard/TEST-TestReshard-20220927163743.xml (deflated 61%) 2022-09-27T16:39:41.7297845Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_embedding_bag/TEST-TestShardedEmbeddingBag-20220927163751.xml (deflated 60%) 2022-09-27T16:39:41.7298715Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_multiple_forward/TEST-TestMultiForward-20220927163759.xml (deflated 41%) 2022-09-27T16:39:41.7299486Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20220927163807.xml (deflated 55%) 2022-09-27T16:39:41.7300252Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerServerTest-20220927163815.xml (deflated 71%) 2022-09-27T16:39:41.7301054Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-LocalTimerTest-20220927163815.xml (deflated 69%) 2022-09-27T16:39:41.7301911Z adding: test/test-reports/python-unittest/distributed.elastic.timer.local_timer_test/TEST-MultiprocessingRequestQueueTest-20220927163815.xml (deflated 66%) 2022-09-27T16:39:41.7302727Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_traversal/TEST-TestTraversal-20220927163823.xml (deflated 40%) 2022-09-27T16:39:41.7303520Z adding: test/test-reports/python-unittest/distributed.elastic.utils.distributed_test/TEST-DistributedUtilTest-20220927163831.xml (deflated 71%) 2022-09-27T16:39:41.7304401Z adding: test/test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20220927163838.xml (deflated 52%) 2022-09-27T16:39:41.7305209Z adding: test/test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper/TEST-TestFlattenParams-20220927163844.xml (deflated 81%) 2022-09-27T16:39:41.7306013Z adding: test/test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper/TEST-TestFlattenParamsCUDA-20220927163844.xml (deflated 80%) 2022-09-27T16:39:41.7306852Z adding: test/test-reports/python-unittest/distributed.fsdp.test_flatten_params_wrapper/TEST-TestFlattenParamsCUDAHalf-20220927163844.xml (deflated 81%) 2022-09-27T16:39:41.7307637Z adding: 
test/test-reports/python-unittest/distributed.elastic.utils.logging_test/TEST-LoggingTest-20220927163849.xml (deflated 53%) 2022-09-27T16:39:41.7308392Z adding: test/test-reports/python-unittest/distributed.test_launcher/TEST-TestDistributedLaunch-20220927163853.xml (deflated 42%) 2022-09-27T16:39:41.7309172Z adding: test/test-reports/python-unittest/distributed._shard.checkpoint.test_utils/TEST-TestMedatadaIndex-20220927163857.xml (deflated 72%) 2022-09-27T16:39:41.7309890Z adding: test/test-reports/python-unittest/distributed.test_nccl/TEST-TestNCCLCUDA-20220927163902.xml (deflated 83%) 2022-09-27T16:39:41.7347979Z ##[group]Run # Remove any previous test reports if they exist 2022-09-27T16:39:41.7348371Z # Remove any previous test reports if they exist 2022-09-27T16:39:41.7348691Z rm -f usage-log-*.zip 2022-09-27T16:39:41.7349045Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2022-09-27T16:39:41.7349437Z # so check to see if the file exists first 2022-09-27T16:39:41.7349749Z if [ -f 'usage_log.txt' ]; then 2022-09-27T16:39:41.7350089Z  zip "usage-log-${FILE_SUFFIX}.zip" 'usage_log.txt' 2022-09-27T16:39:41.7350365Z fi 2022-09-27T16:39:41.7363552Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:39:41.7363847Z env: 2022-09-27T16:39:41.7364073Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:41.7364342Z GPU_FLAG: --gpus all 2022-09-27T16:39:41.7364715Z FILE_SUFFIX: test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338 2022-09-27T16:39:41.7365055Z ##[endgroup] 2022-09-27T16:39:41.7871488Z adding: usage_log.txt (deflated 94%) 2022-09-27T16:39:41.7917288Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-09-27T16:39:41.7917578Z with: 2022-09-27T16:39:41.7917858Z s3-prefix: pytorch/pytorch/3133193930/2/artifact 2022-09-27T16:39:41.7918138Z retention-days: 14 2022-09-27T16:39:41.7918400Z if-no-files-found: warn 2022-09-27T16:39:41.7918676Z path: test-jsons-*.zip 2022-09-27T16:39:41.7918927Z name: artifact 2022-09-27T16:39:41.7919159Z s3-bucket: gha-artifacts 2022-09-27T16:39:41.7919419Z region: us-east-1 2022-09-27T16:39:41.7919751Z env: 2022-09-27T16:39:41.7919984Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:41.7920252Z GPU_FLAG: --gpus all 2022-09-27T16:39:41.7920497Z ##[endgroup] 2022-09-27T16:39:42.1986473Z NOTE: s3-prefix specified, ignoring name parameter 2022-09-27T16:39:42.1986879Z With the provided path, there will be 1 file uploaded 2022-09-27T16:39:42.1987230Z Uploading to s3 prefix: pytorch/pytorch/3133193930/2/artifact 2022-09-27T16:39:42.1998426Z Starting upload of test-jsons-test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338.zip 2022-09-27T16:39:42.3314447Z Finished upload of test-jsons-test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338.zip 2022-09-27T16:39:42.3451456Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-09-27T16:39:42.3451755Z with: 2022-09-27T16:39:42.3452040Z s3-prefix: pytorch/pytorch/3133193930/2/artifact 2022-09-27T16:39:42.3452324Z retention-days: 14 2022-09-27T16:39:42.3452597Z if-no-files-found: error 2022-09-27T16:39:42.3452882Z path: test-reports-*.zip 2022-09-27T16:39:42.3453135Z name: artifact 2022-09-27T16:39:42.3453393Z s3-bucket: gha-artifacts 2022-09-27T16:39:42.3453788Z region: us-east-1 2022-09-27T16:39:42.3454019Z env: 2022-09-27T16:39:42.3454242Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:42.3454508Z GPU_FLAG: --gpus all 2022-09-27T16:39:42.3454759Z ##[endgroup] 2022-09-27T16:39:42.7401163Z NOTE: s3-prefix specified, ignoring name parameter 
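The three artifact uploads in this sequence all invoke the same seemethere/upload-artifact-s3@v5 action with different globs (test-jsons-*.zip, test-reports-*.zip, usage-log-*.zip) and different if-no-files-found policies (warn, error, ignore), all targeting the gha-artifacts bucket under the prefix pytorch/pytorch/3133193930/2/artifact. As a hedged illustration only, each upload amounts to something like the following AWS CLI loop; this is a sketch assuming configured AWS credentials, not what the action actually executes (it talks to S3 through the SDK rather than the CLI):

    # Approximate CLI equivalent of one upload-artifact-s3 invocation.
    # Bucket, prefix, and glob are taken from the log; credentials are assumed.
    for f in test-jsons-*.zip; do
      aws s3 cp "$f" "s3://gha-artifacts/pytorch/pytorch/3133193930/2/artifact/$f"
    done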
2022-09-27T16:39:42.7402043Z With the provided path, there will be 1 file uploaded 2022-09-27T16:39:42.7402557Z Uploading to s3 prefix: pytorch/pytorch/3133193930/2/artifact 2022-09-27T16:39:42.7412086Z Starting upload of test-reports-test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338.zip 2022-09-27T16:39:42.9226293Z Finished upload of test-reports-test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338.zip 2022-09-27T16:39:42.9363687Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-09-27T16:39:42.9363983Z with: 2022-09-27T16:39:42.9364267Z s3-prefix: pytorch/pytorch/3133193930/2/artifact 2022-09-27T16:39:42.9364557Z retention-days: 14 2022-09-27T16:39:42.9364830Z if-no-files-found: ignore 2022-09-27T16:39:42.9365124Z path: usage-log-*.zip 2022-09-27T16:39:42.9365357Z name: artifact 2022-09-27T16:39:42.9365609Z s3-bucket: gha-artifacts 2022-09-27T16:39:42.9365869Z region: us-east-1 2022-09-27T16:39:42.9366097Z env: 2022-09-27T16:39:42.9366317Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:42.9366581Z GPU_FLAG: --gpus all 2022-09-27T16:39:42.9366826Z ##[endgroup] 2022-09-27T16:39:43.3317257Z NOTE: s3-prefix specified, ignoring name parameter 2022-09-27T16:39:43.3317678Z With the provided path, there will be 1 file uploaded 2022-09-27T16:39:43.3318051Z Uploading to s3 prefix: pytorch/pytorch/3133193930/2/artifact 2022-09-27T16:39:43.3327870Z Starting upload of usage-log-test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338.zip 2022-09-27T16:39:43.5306043Z Finished upload of usage-log-test-distributed-2-3-linux.8xlarge.nvidia.gpu_8576432338.zip 2022-09-27T16:39:43.5450860Z ##[group]Run set -x 2022-09-27T16:39:43.5451149Z set -x 2022-09-27T16:39:43.5451453Z python3 -m pip install -r requirements.txt 2022-09-27T16:39:43.5451812Z python3 -m pip install boto3==1.19.12 2022-09-27T16:39:43.5452214Z python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-09-27T16:39:43.5465500Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:39:43.5465800Z env: 2022-09-27T16:39:43.5466045Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:39:43.5466299Z GPU_FLAG: --gpus all 2022-09-27T16:39:43.5466571Z AWS_DEFAULT_REGION: us-east-1 2022-09-27T16:39:43.5466840Z BRANCH: pull/85462 2022-09-27T16:39:43.5467082Z TEST_CONFIG: distributed 2022-09-27T16:39:43.5467339Z SHARD_NUMBER: 2 2022-09-27T16:39:43.5467656Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.10-gcc7 2022-09-27T16:39:43.5467954Z PR_NUMBER: 85462 2022-09-27T16:39:43.5468215Z PYTORCH_RETRY_TEST_CASES: 1 2022-09-27T16:39:43.5468543Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-09-27T16:39:43.5468837Z SHA1: 52424e2bf38e454d535881fed9628d3e20f4f944 2022-09-27T16:39:43.5469112Z TAG: 2022-09-27T16:39:43.5469344Z WORKFLOW_ID: 3133193930 2022-09-27T16:39:43.5469759Z GITHUB_TOKEN: *** 2022-09-27T16:39:43.5470013Z GHA_WORKFLOW_JOB_ID: 8576432338 2022-09-27T16:39:43.5470275Z ##[endgroup] 2022-09-27T16:39:43.5500696Z + python3 -m pip install -r requirements.txt 2022-09-27T16:39:43.8420128Z Defaulting to user installation because normal site-packages is not writeable 2022-09-27T16:39:43.9244921Z Collecting astunparse 2022-09-27T16:39:43.9414424Z Downloading astunparse-1.6.3-py2.py3-none-any.whl (12 kB) 2022-09-27T16:39:43.9743319Z Collecting expecttest 2022-09-27T16:39:43.9800115Z Downloading expecttest-0.1.3-py3-none-any.whl (6.5 kB) 2022-09-27T16:39:44.0196627Z Collecting future 2022-09-27T16:39:44.0239280Z Downloading future-0.18.2.tar.gz (829 kB) 2022-09-27T16:39:45.9084342Z Collecting hypothesis 
2022-09-27T16:39:45.9181454Z Downloading hypothesis-6.54.6-py3-none-any.whl (390 kB) 2022-09-27T16:39:46.7123709Z Collecting numpy 2022-09-27T16:39:46.7184048Z Downloading numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB) 2022-09-27T16:39:47.0516827Z Requirement already satisfied: psutil in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (5.9.1) 2022-09-27T16:39:47.1743150Z Collecting pyyaml 2022-09-27T16:39:47.1797635Z Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB) 2022-09-27T16:39:47.2012825Z Requirement already satisfied: requests in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (2.26.0) 2022-09-27T16:39:47.2193220Z Requirement already satisfied: setuptools in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 10)) (49.1.3) 2022-09-27T16:39:47.2801875Z Collecting six 2022-09-27T16:39:47.2854247Z Downloading six-1.16.0-py2.py3-none-any.whl (11 kB) 2022-09-27T16:39:47.3208224Z Collecting types-dataclasses 2022-09-27T16:39:47.3257037Z Downloading types_dataclasses-0.6.6-py3-none-any.whl (2.9 kB) 2022-09-27T16:39:47.3691667Z Collecting typing_extensions 2022-09-27T16:39:47.3742212Z Downloading typing_extensions-4.3.0-py3-none-any.whl (25 kB) 2022-09-27T16:39:47.4307502Z Collecting sympy 2022-09-27T16:39:47.4371974Z Downloading sympy-1.10.1-py3-none-any.whl (6.4 MB) 2022-09-27T16:39:47.6667536Z Collecting wheel<1.0,>=0.23.0 2022-09-27T16:39:47.6711504Z Downloading wheel-0.37.1-py2.py3-none-any.whl (35 kB) 2022-09-27T16:39:47.7096122Z Collecting exceptiongroup>=1.0.0rc8; python_version < "3.11" 2022-09-27T16:39:47.7139523Z Downloading exceptiongroup-1.0.0rc9-py3-none-any.whl (12 kB) 2022-09-27T16:39:47.7576859Z Collecting sortedcontainers<3.0.0,>=2.1.0 2022-09-27T16:39:47.7621700Z Downloading sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB) 2022-09-27T16:39:47.8149718Z Collecting attrs>=19.2.0 2022-09-27T16:39:47.8257623Z Downloading attrs-22.1.0-py2.py3-none-any.whl (58 kB) 2022-09-27T16:39:47.8726885Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (1.26.12) 2022-09-27T16:39:47.8951957Z Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2.0.12) 2022-09-27T16:39:47.8978115Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2022.9.24) 2022-09-27T16:39:47.8991375Z Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (3.4) 2022-09-27T16:39:47.9279913Z Collecting mpmath>=0.19 2022-09-27T16:39:47.9395082Z Downloading mpmath-1.2.1-py3-none-any.whl (532 kB) 2022-09-27T16:39:47.9663812Z Using legacy 'setup.py install' for future, since package 'wheel' is not installed. 2022-09-27T16:39:48.1122843Z Installing collected packages: wheel, six, astunparse, expecttest, future, exceptiongroup, sortedcontainers, attrs, hypothesis, numpy, pyyaml, types-dataclasses, typing-extensions, mpmath, sympy 2022-09-27T16:39:48.1404436Z WARNING: The script wheel is installed in '/home/ec2-user/.local/bin' which is not on PATH. 
2022-09-27T16:39:48.1405272Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-09-27T16:39:48.1852468Z Running setup.py install for future: started 2022-09-27T16:39:48.8452255Z Running setup.py install for future: finished with status 'done' 2022-09-27T16:39:49.1520390Z WARNING: The script hypothesis is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-09-27T16:39:49.1521168Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-09-27T16:39:51.1130358Z WARNING: The scripts f2py, f2py3 and f2py3.7 are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-09-27T16:39:51.1131197Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-09-27T16:39:59.9832011Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-09-27T16:39:59.9832715Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-09-27T16:40:00.0265605Z Successfully installed astunparse-1.6.3 attrs-22.1.0 exceptiongroup-1.0.0rc9 expecttest-0.1.3 future-0.18.2 hypothesis-6.54.6 mpmath-1.2.1 numpy-1.21.6 pyyaml-6.0 six-1.16.0 sortedcontainers-2.4.0 sympy-1.10.1 types-dataclasses-0.6.6 typing-extensions-4.3.0 wheel-0.37.1 2022-09-27T16:40:00.0985837Z + python3 -m pip install boto3==1.19.12 2022-09-27T16:40:00.3909118Z Defaulting to user installation because normal site-packages is not writeable 2022-09-27T16:40:01.3469356Z Collecting boto3==1.19.12 2022-09-27T16:40:01.3669981Z Downloading boto3-1.19.12-py3-none-any.whl (131 kB) 2022-09-27T16:40:01.4299487Z Collecting s3transfer<0.6.0,>=0.5.0 2022-09-27T16:40:01.4377651Z Downloading s3transfer-0.5.2-py3-none-any.whl (79 kB) 2022-09-27T16:40:01.4867498Z Collecting jmespath<1.0.0,>=0.7.1 2022-09-27T16:40:01.4973330Z Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB) 2022-09-27T16:40:02.6856440Z Collecting botocore<1.23.0,>=1.22.12 2022-09-27T16:40:02.6927171Z Downloading botocore-1.22.12-py3-none-any.whl (8.1 MB) 2022-09-27T16:40:02.9301171Z Collecting python-dateutil<3.0.0,>=2.1 2022-09-27T16:40:02.9400202Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2022-09-27T16:40:02.9579636Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /home/ec2-user/.local/lib/python3.7/site-packages (from botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.26.12) 2022-09-27T16:40:02.9799540Z Requirement already satisfied: six>=1.5 in /home/ec2-user/.local/lib/python3.7/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.16.0) 2022-09-27T16:40:03.1592829Z Installing collected packages: python-dateutil, jmespath, botocore, s3transfer, boto3 2022-09-27T16:40:04.0572301Z Successfully installed boto3-1.19.12 botocore-1.22.12 jmespath-0.10.0 python-dateutil-2.8.2 s3transfer-0.5.2 2022-09-27T16:40:04.1106885Z + python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-09-27T16:40:14.0843993Z [scribe] Scribe access token not provided, sending report via boto3... 
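The stats invocation above pins boto3==1.19.12 on top of requirements.txt and then runs tools.stats.print_test_stats over the test directory; with no Scribe access token in the environment, the tool falls back to uploading the report via boto3, which is what the "[scribe]" line reports. A minimal sketch for reproducing the step from a pytorch/pytorch checkout, with the commands copied verbatim from the log (the S3 flags assume AWS credentials with access to the stats bucket):

    # Re-run the historic test-stats comparison from a pytorch checkout.
    # Commands are verbatim from this job; S3 access is assumed.
    python3 -m pip install -r requirements.txt
    python3 -m pip install boto3==1.19.12
    python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test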
2022-09-27T16:40:14.0844270Z
2022-09-27T16:40:14.0848855Z ----- Historic stats comparison result ------
2022-09-27T16:40:14.0849100Z
2022-09-27T16:40:14.0849758Z job: linux-bionic-cuda11.6-py3.10-gcc7
2022-09-27T16:40:14.0850126Z commit: 52424e2bf38e454d535881fed9628d3e20f4f944
2022-09-27T16:40:14.0850329Z
2022-09-27T16:40:14.0850559Z Commit graph (base is most recent master ancestor with at least one S3 report):
2022-09-27T16:40:14.0854418Z
2022-09-27T16:40:14.0854981Z : (master)
2022-09-27T16:40:14.0855224Z |
2022-09-27T16:40:14.0855520Z | * 52424e2bf3 (HEAD) total time 2720.43s
2022-09-27T16:40:14.0855808Z | |
2022-09-27T16:40:14.0856009Z | : (4 commits)
2022-09-27T16:40:14.0856237Z |/
2022-09-27T16:40:14.0856985Z * c7c2578f93 (base) 9 reports, total time 3338.43s ± 1882.56s
2022-09-27T16:40:14.0857433Z * 99ad8a3048 9 reports, total time 3402.28s ± 1860.32s
2022-09-27T16:40:14.0857840Z * 34296e2f4c 9 reports, total time 3340.35s ± 1886.81s
2022-09-27T16:40:14.0858262Z * 4523ac7aa1 9 reports, total time 3366.30s ± 1853.68s
2022-09-27T16:40:14.0858687Z * f21e77d9a6 9 reports, total time 3413.29s ± 1861.08s
2022-09-27T16:40:14.0859087Z * 26a861cb27 9 reports, total time 3325.52s ± 1844.86s
2022-09-27T16:40:14.0859517Z * 56a41b5998 9 reports, total time 3452.70s ± 1976.19s
2022-09-27T16:40:14.0859966Z * 1910c5847e 9 reports, total time 3444.18s ± 2043.46s
2022-09-27T16:40:14.0860391Z * caa0ab557d 9 reports, total time 3312.06s ± 1824.07s
2022-09-27T16:40:14.0860977Z * 0336308be5 0 reports
2022-09-27T16:40:14.0861218Z |
2022-09-27T16:40:14.0861425Z :
2022-09-27T16:40:14.0861543Z
2022-09-27T16:40:14.0861709Z Removed (across 894 suites) 0 tests, totaling 0.00s
2022-09-27T16:40:14.0862064Z Modified (across 0 suites) 0 tests, totaling 0.00s
2022-09-27T16:40:14.0862411Z Added (across 69 suites) 651 tests, totaling +2720.43s
2022-09-27T16:40:14.1398361Z Prepare all required actions
2022-09-27T16:40:14.1444280Z ##[group]Run ./.github/actions/teardown-linux
2022-09-27T16:40:14.1444566Z with:
2022-09-27T16:40:14.1444758Z env:
2022-09-27T16:40:14.1444995Z GIT_DEFAULT_BRANCH: master
2022-09-27T16:40:14.1445260Z GPU_FLAG: --gpus all
2022-09-27T16:40:14.1445487Z ##[endgroup]
2022-09-27T16:40:14.1464438Z ##[group]Run set -eou pipefail
2022-09-27T16:40:14.1464740Z set -eou pipefail
2022-09-27T16:40:14.1464993Z
2022-09-27T16:40:14.1465311Z echo "Holding runner for 2 hours until all ssh sessions have logged out"
2022-09-27T16:40:14.1465648Z for _ in $(seq 1440); do
2022-09-27T16:40:14.1465943Z  # Break if no ssh session exists anymore
2022-09-27T16:40:14.1466241Z  if [ "$(who)" = "" ]; then
2022-09-27T16:40:14.1466480Z  break
2022-09-27T16:40:14.1466713Z  fi
2022-09-27T16:40:14.1466946Z  echo "."
2022-09-27T16:40:14.1467207Z  sleep 5 2022-09-27T16:40:14.1467444Z done 2022-09-27T16:40:14.1481049Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:40:14.1481344Z env: 2022-09-27T16:40:14.1481573Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:40:14.1481821Z GPU_FLAG: --gpus all 2022-09-27T16:40:14.1482067Z ##[endgroup] 2022-09-27T16:40:14.1511128Z Holding runner for 2 hours until all ssh sessions have logged out 2022-09-27T16:40:14.1580242Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2022-09-27T16:40:14.1580673Z # ignore expansion of "docker ps -q" since it could be empty 2022-09-27T16:40:14.1581015Z # shellcheck disable=SC2046 2022-09-27T16:40:14.1581300Z docker stop $(docker ps -q) || true 2022-09-27T16:40:14.1581610Z # Prune all of the docker images 2022-09-27T16:40:14.1581906Z docker system prune -af 2022-09-27T16:40:14.1594308Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-09-27T16:40:14.1594611Z env: 2022-09-27T16:40:14.1594850Z GIT_DEFAULT_BRANCH: master 2022-09-27T16:40:14.1595098Z GPU_FLAG: --gpus all 2022-09-27T16:40:14.1595345Z ##[endgroup] 2022-09-27T16:40:15.1779111Z ac37d1fee4fc 2022-09-27T16:40:15.9249197Z Deleted Containers: 2022-09-27T16:40:15.9249586Z ac37d1fee4fc78027b66bf2dbe82cdb150df17552519faa479ea1f0aad2016f1 2022-09-27T16:40:15.9249850Z 2022-09-27T16:40:20.9587496Z Deleted Images: 2022-09-27T16:40:20.9588359Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:e66cf5fa0a4d4ed512901b12ccdab95cca946a29 2022-09-27T16:40:20.9589369Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7@sha256:9bb261bc4d8aeb82a71b1f0709da9c979e85a12a79c4a85c3fe3adddddcb2663 2022-09-27T16:40:20.9589999Z deleted: sha256:1565775a6d0c052a41180f67487ffe62db8903a6db8b459487e830a767b885e7 2022-09-27T16:40:20.9590449Z deleted: sha256:3e4cb2c2b5f9e2c80e23a8d896fa974adc4b0e3a54cc06c0a8afb922880fdac8 2022-09-27T16:40:20.9591203Z deleted: sha256:d0ad3421a88e79ceec8792dd7f305a7f9da57daa0119b35d0a37114fd2a8dcf9 2022-09-27T16:40:20.9591685Z deleted: sha256:701e765e83a6966eae0a6dd8fc686dd1787c602cf538ef7dea4100368068fcc9 2022-09-27T16:40:20.9592117Z deleted: sha256:8e85fd60215cbc7be6e9f6423e1f87f0e07c0c672606d79f499dbc625e3eda75 2022-09-27T16:40:20.9592548Z deleted: sha256:e5bde08e611f82f97b690c9fa678975e18d51c1d1bc1c8ddef0588f1a022d639 2022-09-27T16:40:20.9592967Z deleted: sha256:68800e2bce8407a9b0b64d467c217a2bb27e85fd2f3ab0c5793759d5443bf962 2022-09-27T16:40:20.9593638Z deleted: sha256:2457e2e8215ddc219f679a7f957fdc3c639147fc569a0cc1bc9f3a3a97ecd0b2 2022-09-27T16:40:20.9594081Z deleted: sha256:b9c61d9be1eb46057ce27c51ac051ba0ba53d440bda828612cfc06ae78352d7a 2022-09-27T16:40:20.9594524Z deleted: sha256:73dd107fbbff1e853191e9dcb9b75847329af3c80a5883c3ae039e14fa4caa0a 2022-09-27T16:40:20.9594950Z deleted: sha256:fe97c2c62ffb15e65ed2751c4c9069d62e91f204ff1c30980d18680617c5ae40 2022-09-27T16:40:20.9595468Z deleted: sha256:3393d072706e15983b6dba8491dcfaf03de10df9be7d9df9a80ef0dedf384b3e 2022-09-27T16:40:20.9595903Z deleted: sha256:328eaea416a7110b96dd1c4421ae76dac29d151937f13f647291840207613cfd 2022-09-27T16:40:20.9596313Z deleted: sha256:1f62c29a7ac809a8add06b0bc14387e38ec12e88034824daccfdd78a0ee24d00 2022-09-27T16:40:20.9596752Z deleted: sha256:a364f929327c0e6baf5da371c2e514c56a7dc2fcbd3f7f2229d49bf2fb3a2f27 2022-09-27T16:40:20.9597188Z deleted: sha256:14a20e2c555e8aea13a33ca9c7d20333ce850c23d3a6726d7b102a426c2bb100 
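The hold-runner loop shown at the top of this teardown step bounds the wait at 1440 iterations of a 5-second sleep, i.e. 7200 s, matching the "2 hours" in its echo; it breaks as soon as `who` reports no login sessions, and in this run it exited immediately (no "." heartbeats appear before the docker cleanup group). An annotated copy of the loop, with comments added for reference:

    # Hold the runner until all ssh debug sessions end, up to 2 hours.
    # `who` lists active login sessions; empty output releases the runner.
    for _ in $(seq 1440); do          # 1440 x 5s = 7200s = 2h upper bound
      if [ "$(who)" = "" ]; then
        break                         # no ssh sessions left
      fi
      echo "."                        # heartbeat in the job log
      sleep 5
    done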
2022-09-27T16:40:20.9597623Z deleted: sha256:865844d2dbf2eb79ca92ace32b76fd966b142ecccd91a6190a256f4d278fed74 2022-09-27T16:40:20.9598033Z deleted: sha256:6a56212e72bab94d436ef418303481e278e79a012bf88cc727231093221416e7 2022-09-27T16:40:20.9598442Z deleted: sha256:8ff11e3f91c81bfe67a7da7b03306cdc936f730f44e7dbc401299c81679f3a20 2022-09-27T16:40:20.9598860Z deleted: sha256:e3ebc0052e1300157e36aa327172ad1098717918585d1494c4efc4afdbb8ecbe 2022-09-27T16:40:20.9599278Z deleted: sha256:4b7db58c33bbdbe2e491f721c5a9724ef07d9a720d77bd517c1f5c8fad212ba2 2022-09-27T16:40:20.9599729Z deleted: sha256:1e68b2d3e3c7bf741f5f060b3a88095db0d5eefd841ce4e86b3c466f882997e6 2022-09-27T16:40:20.9600173Z deleted: sha256:4da2a66a2b27b034b75aa5a024cf7b8cfe1f8762b0725921f8b00cb3a0505759 2022-09-27T16:40:20.9600615Z deleted: sha256:37c6c713aea14d8a4a5f0dfd63f80f04633b0d6f8e7baed0a1feba47d709cbeb 2022-09-27T16:40:20.9601033Z deleted: sha256:c636e275b4e8c2d9022e72ee8d8528006ef92ad2fab903ba909244c2b9aa4bb0 2022-09-27T16:40:20.9601488Z deleted: sha256:2b912a242a1b69ebde3bf40dfab4b693cfa93f9bcb0d8be6c698b44b4284a70d 2022-09-27T16:40:20.9601931Z deleted: sha256:e4ff425d4caf55cc2f5939271ab53c4382b45c5454ccb0e4fa62cbb730aa8ba7 2022-09-27T16:40:20.9602379Z deleted: sha256:f80b712f0269de04dedd4bb68f0a2103eb0ac5bb70e5da74a7fe0544c8a678af 2022-09-27T16:40:20.9602809Z deleted: sha256:c048927fdfe44aec1063f37c0137ad63b7537c23dd3051ad691232ce363048fa 2022-09-27T16:40:20.9603287Z deleted: sha256:85d0da5e4c90646a2049cf17e470deedd05b19ab1535e4932940055fc36e1c91 2022-09-27T16:40:20.9604116Z deleted: sha256:a94964edf2ce4a0b440f5d000d2ea97c6145adcb1ad201ade6214f02cd47626c 2022-09-27T16:40:20.9604901Z deleted: sha256:fc49e4f76476298cd45a4ef31767534bd2b6663c90683ccbb4d911ea77a93d56 2022-09-27T16:40:20.9605886Z deleted: sha256:25a90c74f1ddefff551a5575bbaeaf1fd45c71d629ef061f41a227f2813ebf62 2022-09-27T16:40:20.9606426Z deleted: sha256:9c67a539cd718b76768a481e6313606489c7afbdd2ed5f0d35f94a0c8161ff59 2022-09-27T16:40:20.9606837Z deleted: sha256:78cfd7e90437d18d294dd32f35632e1e0f4e93f98d2f74562b7a2a483d89e847 2022-09-27T16:40:20.9607265Z deleted: sha256:feccb37688a9f8b307ab298febd901ccc8920a2ec8e1660a0eb8de5b6e41a3b1 2022-09-27T16:40:20.9607696Z deleted: sha256:eeded198b25e05f7ca35680574d2863e48240eb544a4945d562943ed7c519eaa 2022-09-27T16:40:20.9608120Z deleted: sha256:eba0a8e367727026cab4d61ce7412eac92df541fa66418907501f777a611d01a 2022-09-27T16:40:20.9608528Z deleted: sha256:ab3963854dae2a8321e764d73ecfb89e77618f391cbe2698ab09c8129a35eb29 2022-09-27T16:40:20.9608942Z deleted: sha256:db198971615e367404b8c928ff87c4bc7e81b060fa78a8431f2124d37e546a69 2022-09-27T16:40:20.9609352Z deleted: sha256:116af78f4b87e494b37b6a65d9e3abdd828764632b0adf869c40db8a8afb7745 2022-09-27T16:40:20.9609772Z deleted: sha256:21479387e93277b4c3c8df1cbd39208f3c6aa4128b6f0eebaafa7e3b82d8a23c 2022-09-27T16:40:20.9610184Z deleted: sha256:5731c54a72f20a93dd164dc484ecd3fd0bf0f0cb98eec92e2382f99c0c43e1a6 2022-09-27T16:40:20.9610638Z deleted: sha256:8b6f381bcc3be3c07a12a8e528c8926c369875dccb5d18323007a3beb34cdf52 2022-09-27T16:40:20.9611177Z deleted: sha256:07dcc550108cf9d433a453ce3e351ac30e67c1ccbc5f7bc71b408e2fb4bcc6a9 2022-09-27T16:40:20.9611590Z deleted: sha256:129bdb873e79117f4e90135f0c6a58f775fcf596f4eb514b803771cef2da8278 2022-09-27T16:40:20.9612025Z deleted: sha256:2d49e3a81bd436bfd20fb4a849cdc98da82cb74afef3de38dda7a946d3fc4153 2022-09-27T16:40:20.9612473Z deleted: sha256:0ba4e259108e5311ddf6b79ae3a35f8f16a4004ef8817e50427baa3cc90ac081 2022-09-27T16:40:20.9612970Z 
deleted: sha256:c164403226561914f16becdeca65c54d20dba8dad414b062efc34c05c47bf725 2022-09-27T16:40:20.9613385Z deleted: sha256:cbe4006b2e6286d50c1b292fb71b69d5299d65f055285519eafc41eac3ef8a3c 2022-09-27T16:40:20.9613808Z deleted: sha256:edcec18dceb25f1a03ec20de4676464613e69072875a83f5c45e45a31aafc5b9 2022-09-27T16:40:20.9614226Z deleted: sha256:13c4f317ac4bb48997302756b8d5f8b602e835607c9806a1a5b200e9a0657d8a 2022-09-27T16:40:20.9614614Z deleted: sha256:57f043e380f4586c76968d6e062b50bac55254a5be7e80bea3c027a5bb316469 2022-09-27T16:40:20.9615023Z deleted: sha256:3e549931e0240b9aac25dc79ed6a6259863879a5c9bd20755f77cac27c1ab8c8 2022-09-27T16:40:20.9615267Z 2022-09-27T16:40:20.9690919Z Total reclaimed space: 19.33GB 2022-09-27T16:40:20.9753276Z Post job cleanup. 2022-09-27T16:40:20.9792628Z Post job cleanup. 2022-09-27T16:40:21.1140737Z [command]/usr/bin/git version 2022-09-27T16:40:21.1191148Z git version 2.37.1 2022-09-27T16:40:21.1257648Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/e78c24e8-62fc-4763-b224-68d1e8c56ddb' before making global git config changes 2022-09-27T16:40:21.1258232Z Adding repository directory to the temporary git global config as a safe directory 2022-09-27T16:40:21.1267156Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-09-27T16:40:21.1316729Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-09-27T16:40:21.1356565Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-09-27T16:40:21.1674735Z Entering 'android/libs/fbjni' 2022-09-27T16:40:21.1715756Z Entering 'third_party/FP16' 2022-09-27T16:40:21.1759129Z Entering 'third_party/FXdiv' 2022-09-27T16:40:21.1798107Z Entering 'third_party/NNPACK' 2022-09-27T16:40:21.1838814Z Entering 'third_party/QNNPACK' 2022-09-27T16:40:21.1880713Z Entering 'third_party/VulkanMemoryAllocator' 2022-09-27T16:40:21.1921205Z Entering 'third_party/XNNPACK' 2022-09-27T16:40:21.1973928Z Entering 'third_party/benchmark' 2022-09-27T16:40:21.2014519Z Entering 'third_party/cpuinfo' 2022-09-27T16:40:21.2055908Z Entering 'third_party/cub' 2022-09-27T16:40:21.2097163Z Entering 'third_party/cudnn_frontend' 2022-09-27T16:40:21.2144345Z Entering 'third_party/cutlass' 2022-09-27T16:40:21.2193556Z Entering 'third_party/eigen' 2022-09-27T16:40:21.2237773Z Entering 'third_party/fbgemm' 2022-09-27T16:40:21.2279160Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-09-27T16:40:21.2322565Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-09-27T16:40:21.2363208Z Entering 'third_party/fbgemm/third_party/googletest' 2022-09-27T16:40:21.2404093Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-09-27T16:40:21.2445702Z Entering 'third_party/flatbuffers' 2022-09-27T16:40:21.2490300Z Entering 'third_party/fmt' 2022-09-27T16:40:21.2532335Z Entering 'third_party/foxi' 2022-09-27T16:40:21.2574005Z Entering 'third_party/gemmlowp/gemmlowp' 2022-09-27T16:40:21.2615231Z Entering 'third_party/gloo' 2022-09-27T16:40:21.2657050Z Entering 'third_party/googletest' 2022-09-27T16:40:21.2698663Z Entering 'third_party/ideep' 2022-09-27T16:40:21.2739993Z Entering 'third_party/ideep/mkl-dnn' 2022-09-27T16:40:21.2785610Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-09-27T16:40:21.2834037Z Entering 'third_party/ios-cmake' 2022-09-27T16:40:21.2876506Z Entering 'third_party/ittapi' 
2022-09-27T16:40:21.2917978Z Entering 'third_party/kineto' 2022-09-27T16:40:21.2958547Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-09-27T16:40:21.2999228Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-09-27T16:40:21.3041717Z Entering 'third_party/nccl/nccl' 2022-09-27T16:40:21.3083830Z Entering 'third_party/neon2sse' 2022-09-27T16:40:21.3123697Z Entering 'third_party/nlohmann' 2022-09-27T16:40:21.3166689Z Entering 'third_party/onnx' 2022-09-27T16:40:21.3222598Z Entering 'third_party/onnx/third_party/benchmark' 2022-09-27T16:40:21.3265308Z Entering 'third_party/onnx/third_party/pybind11' 2022-09-27T16:40:21.3308755Z Entering 'third_party/onnx-tensorrt' 2022-09-27T16:40:21.3349476Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-09-27T16:40:21.3396845Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-09-27T16:40:21.3439511Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-09-27T16:40:21.3480858Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-09-27T16:40:21.3527197Z Entering 'third_party/pocketfft' 2022-09-27T16:40:21.3569154Z Entering 'third_party/protobuf' 2022-09-27T16:40:21.3615022Z Entering 'third_party/protobuf/third_party/benchmark' 2022-09-27T16:40:21.3655750Z Entering 'third_party/protobuf/third_party/googletest' 2022-09-27T16:40:21.3698024Z Entering 'third_party/psimd' 2022-09-27T16:40:21.3739274Z Entering 'third_party/pthreadpool' 2022-09-27T16:40:21.3781440Z Entering 'third_party/pybind11' 2022-09-27T16:40:21.3823180Z Entering 'third_party/python-enum' 2022-09-27T16:40:21.3864183Z Entering 'third_party/python-peachpy' 2022-09-27T16:40:21.3905162Z Entering 'third_party/python-six' 2022-09-27T16:40:21.3946081Z Entering 'third_party/sleef' 2022-09-27T16:40:21.3988298Z Entering 'third_party/tbb' 2022-09-27T16:40:21.4033060Z Entering 'third_party/tensorpipe' 2022-09-27T16:40:21.4075038Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-09-27T16:40:21.4116564Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-09-27T16:40:21.4157636Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-09-27T16:40:21.4199287Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-09-27T16:40:21.4239204Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-09-27T16:40:21.4283584Z Entering 'third_party/zstd' 2022-09-27T16:40:21.4345398Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-09-27T16:40:21.4374435Z http.https://github.com/.extraheader 2022-09-27T16:40:21.4385717Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2022-09-27T16:40:21.4424895Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-09-27T16:40:21.4734927Z Entering 'android/libs/fbjni' 2022-09-27T16:40:21.4759013Z http.https://github.com/.extraheader 2022-09-27T16:40:21.4790659Z Entering 'third_party/FP16' 2022-09-27T16:40:21.4815601Z http.https://github.com/.extraheader 2022-09-27T16:40:21.4847859Z Entering 'third_party/FXdiv' 2022-09-27T16:40:21.4872577Z http.https://github.com/.extraheader 2022-09-27T16:40:21.4904937Z Entering 'third_party/NNPACK' 2022-09-27T16:40:21.4929364Z http.https://github.com/.extraheader 2022-09-27T16:40:21.4961649Z Entering 
'third_party/QNNPACK' 2022-09-27T16:40:21.4985705Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5017654Z Entering 'third_party/VulkanMemoryAllocator' 2022-09-27T16:40:21.5042332Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5074582Z Entering 'third_party/XNNPACK' 2022-09-27T16:40:21.5098207Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5141184Z Entering 'third_party/benchmark' 2022-09-27T16:40:21.5165965Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5198709Z Entering 'third_party/cpuinfo' 2022-09-27T16:40:21.5223454Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5256197Z Entering 'third_party/cub' 2022-09-27T16:40:21.5280603Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5312562Z Entering 'third_party/cudnn_frontend' 2022-09-27T16:40:21.5337253Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5375120Z Entering 'third_party/cutlass' 2022-09-27T16:40:21.5399937Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5439206Z Entering 'third_party/eigen' 2022-09-27T16:40:21.5462786Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5498622Z Entering 'third_party/fbgemm' 2022-09-27T16:40:21.5523458Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5555206Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-09-27T16:40:21.5578900Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5611838Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-09-27T16:40:21.5636127Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5667864Z Entering 'third_party/fbgemm/third_party/googletest' 2022-09-27T16:40:21.5693236Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5725417Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-09-27T16:40:21.5749118Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5783473Z Entering 'third_party/flatbuffers' 2022-09-27T16:40:21.5807670Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5841344Z Entering 'third_party/fmt' 2022-09-27T16:40:21.5864956Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5897313Z Entering 'third_party/foxi' 2022-09-27T16:40:21.5921888Z http.https://github.com/.extraheader 2022-09-27T16:40:21.5953521Z Entering 'third_party/gemmlowp/gemmlowp' 2022-09-27T16:40:21.5978480Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6011113Z Entering 'third_party/gloo' 2022-09-27T16:40:21.6034932Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6066647Z Entering 'third_party/googletest' 2022-09-27T16:40:21.6091744Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6123343Z Entering 'third_party/ideep' 2022-09-27T16:40:21.6147092Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6178464Z Entering 'third_party/ideep/mkl-dnn' 2022-09-27T16:40:21.6202909Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6236354Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-09-27T16:40:21.6259985Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6300180Z Entering 'third_party/ios-cmake' 2022-09-27T16:40:21.6325169Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6357335Z Entering 'third_party/ittapi' 2022-09-27T16:40:21.6380735Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6412364Z Entering 'third_party/kineto' 2022-09-27T16:40:21.6436283Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6467373Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-09-27T16:40:21.6491600Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6523543Z 
Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-09-27T16:40:21.6546976Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6580859Z Entering 'third_party/nccl/nccl' 2022-09-27T16:40:21.6605431Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6637133Z Entering 'third_party/neon2sse' 2022-09-27T16:40:21.6660941Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6692572Z Entering 'third_party/nlohmann' 2022-09-27T16:40:21.6717702Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6750982Z Entering 'third_party/onnx' 2022-09-27T16:40:21.6776287Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6821165Z Entering 'third_party/onnx/third_party/benchmark' 2022-09-27T16:40:21.6846731Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6879909Z Entering 'third_party/onnx/third_party/pybind11' 2022-09-27T16:40:21.6903995Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6938783Z Entering 'third_party/onnx-tensorrt' 2022-09-27T16:40:21.6963296Z http.https://github.com/.extraheader 2022-09-27T16:40:21.6995165Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-09-27T16:40:21.7019663Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7057313Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-09-27T16:40:21.7081947Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7114532Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-09-27T16:40:21.7138557Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7170569Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-09-27T16:40:21.7194804Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7231708Z Entering 'third_party/pocketfft' 2022-09-27T16:40:21.7255686Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7286996Z Entering 'third_party/protobuf' 2022-09-27T16:40:21.7311110Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7346573Z Entering 'third_party/protobuf/third_party/benchmark' 2022-09-27T16:40:21.7372163Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7404581Z Entering 'third_party/protobuf/third_party/googletest' 2022-09-27T16:40:21.7428196Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7462481Z Entering 'third_party/psimd' 2022-09-27T16:40:21.7488879Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7522668Z Entering 'third_party/pthreadpool' 2022-09-27T16:40:21.7546669Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7578268Z Entering 'third_party/pybind11' 2022-09-27T16:40:21.7603383Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7635850Z Entering 'third_party/python-enum' 2022-09-27T16:40:21.7659990Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7692447Z Entering 'third_party/python-peachpy' 2022-09-27T16:40:21.7716849Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7748418Z Entering 'third_party/python-six' 2022-09-27T16:40:21.7773171Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7804463Z Entering 'third_party/sleef' 2022-09-27T16:40:21.7828466Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7860906Z Entering 'third_party/tbb' 2022-09-27T16:40:21.7885439Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7919872Z Entering 'third_party/tensorpipe' 2022-09-27T16:40:21.7944487Z http.https://github.com/.extraheader 2022-09-27T16:40:21.7976643Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-09-27T16:40:21.8000587Z 
http.https://github.com/.extraheader 2022-09-27T16:40:21.8032150Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-09-27T16:40:21.8055614Z http.https://github.com/.extraheader 2022-09-27T16:40:21.8087215Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-09-27T16:40:21.8110924Z http.https://github.com/.extraheader 2022-09-27T16:40:21.8142833Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-09-27T16:40:21.8166779Z http.https://github.com/.extraheader 2022-09-27T16:40:21.8199053Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-09-27T16:40:21.8222665Z http.https://github.com/.extraheader 2022-09-27T16:40:21.8258290Z Entering 'third_party/zstd' 2022-09-27T16:40:21.8282923Z http.https://github.com/.extraheader 2022-09-27T16:40:21.8581338Z Cleaning up orphan processes
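The credential scrub traced above mirrors what the checkout action configures at job start: it unsets the http.https://github.com/.extraheader entry (the injected Authorization header) from the top-level repository's local git config and, via git submodule foreach --recursive, from each of the submodules listed, so the token does not linger on this self-hosted runner. Condensed from the commands in the log, with the quoting of the foreach body an assumption (the log prints it unquoted):

    # Strip the stored auth header from the repo and all submodules.
    git config --local --unset-all 'http.https://github.com/.extraheader'
    git submodule foreach --recursive \
      "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' \
        && git config --local --unset-all 'http.https://github.com/.extraheader' || :"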